(57) Abstract

The present invention provides a method of delivering immunogenic or therapeutic proteins to bone marrow cells using alphavirus
vectors. The alphavirus vectors disclosed herein target specifically to bone marrow tissue, and viral genomes persist in bone marrow for
at least three months post-infection. No or very low levels of virus were detected in quadriceps, brain, and sera of treated animals. The
sequence of a consensus Sindbis cDNA clone, pTR339, and infectious RNA transcripts, infectious virus particles, and pharmaceutical
formulations derived therefrom are also disclosed. The sequence of the genomic RNA of the Girdwood S.A. virus, and cDNA clones,
infectious RNA transcripts, infectious virus particles, and pharmaceutical formulations derived therefrom are also disclosed.

SYSTEM FOR THE IN VIVO DELIVERY AND
EXPRESSION OF HETEROLOGOUS GENES IN
THE BONE MARROW

FEDERALLY SPONSORED RESEARCH 
This invention was made with Government support under Grant
Number 5 R01 AI22186 from the National Institutes of Health. The Government
has certain rights to this invention.

FIELD OF THE INVENTION 
The present invention relates to recombinant DNA technology, and in 
particular to introducing and expressing foreign DNA in a eukaryotic cell.

BACKGROUND OF THE INVENTION 
The Alphavirus genus includes a variety of viruses all of which are
members of the Togaviridae family. The alphaviruses include Eastern Equine
Encephalitis virus (EEE), Venezuelan Equine Encephalitis virus (VEE), Everglades
virus, Mucambo virus, Pixuna virus, Western Equine Encephalitis virus (WEE),
Sindbis virus, South African Arbovirus No. 86 (S.A.AR86), Girdwood S.A.
virus, Ockelbo virus, Semliki Forest virus, Middelburg virus, Chikungunya virus,
O'Nyong-Nyong virus, Ross River virus, Barmah Forest virus, Getah virus,
Sagiyama virus, Bebaru virus, Mayaro virus, Una virus, Aura virus, Whataroa
virus, Babanki virus, Kyzylagach virus, Highlands J virus, Fort Morgan virus,
Ndumu virus, and Buggy Creek virus.

The alphavirus genome is a single-stranded, messenger-sense RNA,
modified at the 5'-end with a methylated cap, and at the 3'-end with a
variable-length poly (A) tract. The viral genome is divided into two regions: the first
encodes the nonstructural or replicase proteins (nsP1-nsP4) and the second encodes
the viral structural proteins. Strauss and Strauss, Microbiological Rev. 58,
491-562, 494 (1994). Structural subunits consisting of a single viral protein, C,
associate with themselves and with the RNA genome in an icosahedral
nucleocapsid. In the virion, the capsid is surrounded by a lipid envelope covered
with a regular array of transmembranal protein spikes, each of which consists of
a heterodimeric complex of two glycoproteins, E1 and E2. See Paredes et al.,
Proc. Natl. Acad. Sci. USA 90, 9095-99 (1993); Paredes et al., Virology 187,
324-32 (1993); Pedersen et al., J. Virol. 14:40 (1974).

Sindbis virus, the prototype member of the alphavirus genus of the
family Togaviridae, and viruses related to Sindbis are broadly distributed
throughout Africa, Europe, Asia, the Indian subcontinent, and Australia, based on
serological surveys of humans, domestic animals and wild birds. Kokernot et al.,
Trans. R. Soc. Trop. Med. Hyg. 59, 553-62 (1965); Redaksie, S. Afr. Med. J. 42,
197 (1968); Adekolu-John and Fagbami, Trans. R. Soc. Trop. Med. Hyg. 77,
149-51 (1983); Darwish et al., Trans. R. Soc. Trop. Med. Hyg. 77, 442-45 (1983);
Lundström et al., Epidemiol. Infect. 106, 567-74 (1991); Morrill et al., J. Trop.
Med. Hyg. 94, 166-68 (1991). The first isolate of Sindbis virus (strain AR339)
was recovered from a pool of Culex sp. mosquitoes collected in Sindbis, Egypt in
1953 (Taylor et al., Am. J. Trop. Med. Hyg. 4, 844-62 (1955)), and is the most
extensively studied representative of this group. Other members of the Sindbis
group of alphaviruses include South African Arbovirus No. 86, Ockelbo82, and
Girdwood S.A. These viruses are not strains of the Sindbis virus; they are related
to Sindbis AR339, but they are more closely related to each other based on
nucleotide sequence and serological comparisons. Lundström et al., J. Wildl. Dis.
29, 189-95 (1993); Simpson et al., Virology 222, 464-69 (1996). Ockelbo82,
S.A.AR86 and Girdwood S.A. are all associated with human disease, whereas
Sindbis is not. The clinical symptoms of human infection with Ockelbo82,
S.A.AR86, or Girdwood S.A. are a febrile illness, general malaise, macropapular
rash, and joint pain that occasionally progresses to a polyarthralgia sometimes
lasting from a few months to a few years.

The study of these viruses has led to the development of beneficial
techniques for vaccinating against the alphavirus diseases, and other diseases
through the use of alphavirus vectors for the introduction of foreign DNA. See
United States Patent No. 5,185,440 to Davis et al., and PCT Publication WO
92/10578. It is intended that all United States patent references be incorporated
in their entirety by reference.

It is well known that live, attenuated viral vaccines are among the
most successful means of controlling viral disease. However, for some virus
pathogens, immunization with a live virus strain may be either impractical or
unsafe. One alternative strategy is the insertion of sequences encoding immunizing
antigens of such agents into a vaccine strain of another virus. One such system
utilizing a live VEE vector is described in United States Patent No. 5,505,947 to
Johnston et al.

Sindbis virus vaccines have been employed as viral carriers in virus
constructs which express genes encoding immunizing antigens for other viruses.
See United States Patent No. 5,217,879 to Huang et al. Huang et al. describes
Sindbis infectious viral vectors. However, the reference does not describe the
cDNA sequence of Girdwood S.A. and TR339, nor clones or viral vectors
produced therefrom.

Another such system is described by Hahn et al., Proc. Natl. Acad.
Sci. USA 89:2679 (1992), wherein Sindbis virus constructs which express a
truncated form of the influenza hemagglutinin protein are described. The
constructs are used to study antigen processing and presentation in vitro and in
mice. Although no infectious challenge dose is tested, it is also suggested that
such constructs might be used to produce protective B- and T-cell mediated
immunity.

London et al., Proc. Natl. Acad. Sci. USA 89, 207-11 (1992),
disclose a method of producing an immune response in mice against a lethal Rift
Valley Fever (RVF) virus by infecting the mice with an infectious Sindbis virus
containing an RVF epitope. London does not disclose using Girdwood S.A. or
TR339 to induce an immune response in animals.

Viral carriers can also be used to introduce and express foreign
DNA in eukaryotic cells. One goal of such techniques is to employ vectors that
target expression to particular cells and/or tissues. A current approach has been
to remove target cells from the body, culture them ex vivo, infect them with an
expression vector, and then reintroduce them into the patient.

PCT Publication No. WO 92/10578 to Garoff and Liljestrom
provides a system for introducing and expressing foreign proteins in animal cells
using alphaviruses. This reference discloses the use of Semliki Forest virus to
introduce and express foreign proteins in animal cells. The use of Girdwood S.A.
or TR339 is not discussed. Furthermore, this reference does not provide a method
of targeting and introducing foreign DNA into specific cell or tissue types.

Accordingly, there remains a need in the art for full-length cDNA
clones of positive-strand RNA viruses, such as Girdwood S.A. and TR339. In
addition, there is an ongoing need in the art for improved vaccination strategies.
Finally, there remains a need in the art for improved methods and nucleic acid
sequences for delivering foreign DNA to target cells.



SUMMARY OF THE INVENTION
A first aspect of the present invention is a method of introducing
and expressing heterologous RNA in bone marrow cells, comprising: (a) providing
a recombinant alphavirus, the alphavirus containing a heterologous RNA segment,
the heterologous RNA segment comprising a promoter operable in bone marrow
cells operatively associated with a heterologous RNA to be expressed in bone
marrow cells; and then (b) contacting the recombinant alphavirus to the bone
marrow cells so that the heterologous RNA segment is introduced and expressed
therein.

As a second aspect, the present invention provides a helper cell for
expressing an infectious, propagation defective, Girdwood S.A. virus particle,
comprising, in a Girdwood S.A.-permissive cell: (a) a first helper RNA encoding
(i) at least one Girdwood S.A. structural protein, and (ii) not encoding at least one
other Girdwood S.A. structural protein; and (b) a second helper RNA separate
from the first helper RNA, the second helper RNA (i) not encoding the at least one
Girdwood S.A. structural protein encoded by the first helper RNA, and (ii)
encoding the at least one other Girdwood S.A. structural protein not encoded by
the first helper RNA, and with all of the Girdwood S.A. structural proteins
encoded by the first and second helper RNAs assembling together into Girdwood
S.A. particles in the cell containing the replicon RNA; and wherein the Girdwood
S.A. packaging segment is deleted from at least the first helper RNA.

A third aspect of the present invention is a method of making
infectious, propagation defective, Girdwood S.A. virus particles, comprising:
transfecting a Girdwood S.A.-permissive cell with a propagation defective replicon
RNA, the replicon RNA including the Girdwood S.A. packaging segment and an
inserted heterologous RNA; producing the Girdwood S.A. virus particles in the
transfected cell; and then collecting the Girdwood S.A. virus particles from the
cell. Also disclosed are infectious Girdwood S.A. RNAs, cDNAs encoding the
same, infectious Girdwood S.A. virus particles, and pharmaceutical formulations
thereof.

As a fourth aspect, the present invention provides a helper cell for
expressing an infectious, propagation defective, TR339 virus particle, comprising,
in a TR339-permissive cell: (a) a first helper RNA encoding (i) at least one TR339
structural protein, and (ii) not encoding at least one other TR339 structural protein;
and (b) a second helper RNA separate from the first helper RNA, the second
helper RNA (i) not encoding the at least one TR339 structural protein encoded by
the first helper RNA, and (ii) encoding the at least one other TR339 structural
protein not encoded by the first helper RNA, and with all of the TR339 structural
proteins encoded by the first and second helper RNAs assembling together into
TR339 particles in the cell containing the replicon RNA; and wherein the TR339
packaging segment is deleted from at least the first helper RNA.

A fifth aspect of the present invention is a method of making
infectious, propagation defective, TR339 virus particles, comprising: transfecting
a TR339-permissive cell with a propagation defective replicon RNA, the replicon
RNA including the TR339 packaging segment and an inserted heterologous RNA;
producing the TR339 virus particles in the transfected cell; and then collecting the
TR339 virus particles from the cell. Also disclosed are infectious TR339 RNAs,
cDNAs encoding the same, infectious TR339 virus particles, and pharmaceutical
formulations thereof.

As a sixth aspect, the present invention provides a recombinant
DNA comprising a cDNA coding for an infectious Girdwood S.A. virus RNA
transcript, and a heterologous promoter positioned upstream from the cDNA and
operatively associated therewith. The present invention also provides infectious
RNA transcripts encoded by the above-mentioned cDNA and infectious viral
particles containing the infectious RNA transcripts.

As a seventh aspect, the present invention provides a recombinant
DNA comprising a cDNA coding for a Sindbis strain TR339 RNA transcript, and
a heterologous promoter positioned upstream from the cDNA and operatively
associated therewith. The present invention also provides infectious RNA
transcripts encoded by the above-mentioned cDNA and infectious viral particles
containing the infectious RNA transcripts.

The foregoing and other aspects of the present invention are
described in the detailed description set forth below.

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 presents the cDNA sequence (SEQ ID NO:1) of
S.A.AR86. The RNA sequence of the 5' 40 nucleotides was obtained by direct
sequencing of the genomic RNA. The rest of the genome was sequenced by
RT-PCR of fragments amplified from virion RNA. Nucleotides 1 through 59
represent the 5' UTR, the non-structural polyprotein is encoded by nucleotides 60
through 7559 (nsP1--nt60 through nt1679; nsP2--nt1680 through nt4099;
nsP3--nt4100 through nt5729; nsP4--nt5730 through nt7559), the structural
polyprotein is encoded by nucleotides 7608 through 11342 (capsid--nt7608 through
nt8399; E3--nt8400 through nt8591; E2--nt8592 through nt9860; 6K--nt9861
through nt10025; E1--nt10026 through nt11342), and the 3' UTR is represented
by nucleotides 11346 through 11663.
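
The coordinate map recited for Figure 1 can be held as a small lookup table. The following Python sketch is illustrative only and forms no part of the disclosed sequence data: the names `Feature`, `SAAR86_FEATURES`, and `feature_at` are hypothetical, and the coordinates are copied from the description of Figure 1 on the assumption that they are 1-based and inclusive.

```python
from typing import List, NamedTuple, Optional


class Feature(NamedTuple):
    """One annotated region of the cDNA (hypothetical helper type)."""
    name: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Coordinates as recited for S.A.AR86 (SEQ ID NO:1) in the Figure 1 description.
SAAR86_FEATURES: List[Feature] = [
    Feature("5' UTR", 1, 59),
    Feature("nsP1", 60, 1679),
    Feature("nsP2", 1680, 4099),
    Feature("nsP3", 4100, 5729),
    Feature("nsP4", 5730, 7559),
    Feature("capsid", 7608, 8399),
    Feature("E3", 8400, 8591),
    Feature("E2", 8592, 9860),
    Feature("6K", 9861, 10025),
    Feature("E1", 10026, 11342),
    Feature("3' UTR", 11346, 11663),
]


def feature_at(position: int,
               features: List[Feature] = SAAR86_FEATURES) -> Optional[str]:
    """Return the name of the feature covering a nucleotide position, if any."""
    for feature in features:
        if feature.start <= position <= feature.end:
            return feature.name
    return None
```

A position that falls between annotated regions, such as the interval between the two polyprotein coding regions, returns `None` rather than a feature name.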

Figure 1A shows nucleotides 1 through 3800 of the cDNA sequence
of S.A.AR86.

Figure 1B shows nucleotides 3801 through 7900 of the cDNA
sequence of S.A.AR86.

Figure 1C shows nucleotides 7901 through 11663 of the cDNA
sequence of S.A.AR86.

Figure 2 presents the putative amino acid sequences of the
S.A.AR86 polyproteins (SEQ ID NO:2 and SEQ ID NO:3). The amino acids
were derived from the S.A.AR86 cDNA sequence given in Figure 1 (SEQ ID
NO:1).

Figure 2A shows the amino acid sequence of the non-structural
polyprotein of S.A.AR86 (SEQ ID NO:2).

Figure 2B shows the amino acid sequence of the structural
polyprotein of S.A.AR86 (SEQ ID NO:3).

Figure 3 presents the cDNA sequence (SEQ ID NO:4) of Girdwood
S.A. The RNA sequence of the 5' 40 nucleotides was obtained by direct
sequencing of the genomic RNA. The rest of the genome sequence was obtained
by sequencing of fragments amplified by RT-PCR from virion RNA. An "N" in
the sequence indicates that the identity of the nucleotide at that position is
unknown. Nucleotides 1 through 59 represent the 5' UTR, the non-structural
polyprotein is encoded by nucleotides 60 through 7613 (nsP1--nt60 through
nt1679; nsP2--nt1680 through nt4099; nsP3--nt4100 through nt5762 or nt5783;
nsP4--nt5784 through nt7613), the structural polyprotein is encoded by nucleotides
7662 through 11396 (capsid--nt7662 through nt8453; E3--nt8454 through nt8645;
E2--nt8646 through nt9914; 6K--nt9915 through nt10079; E1--nt10080 through
nt11396), and the 3' UTR is represented by nucleotides 11400 through 11717.
There is an opal termination codon at nucleotides 5763 through 5765.
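
The two alternative end points given for nsP3 can be related to the opal codon by simple arithmetic. The sketch below is a hedged illustration, not part of the disclosure: it assumes that the shorter end point (nt5762) corresponds to termination at the opal codon and the longer one (nt5783) to read-through of that codon, and the function name `readthrough_extension` is hypothetical.

```python
from typing import Tuple


def readthrough_extension(end_at_stop: int, end_readthrough: int,
                          opal_start: int, opal_end: int) -> Tuple[int, int]:
    """Return (extra nucleotides, extra codons) gained by reading through.

    Assumes 1-based inclusive coordinates and that the opal codon
    immediately follows the shorter nsP3 end point.
    """
    if opal_start != end_at_stop + 1 or opal_end - opal_start + 1 != 3:
        raise ValueError("opal codon does not directly follow the short end point")
    extra_nt = end_readthrough - end_at_stop
    if extra_nt % 3 != 0:
        raise ValueError("extension is not a whole number of codons")
    return extra_nt, extra_nt // 3


# Girdwood S.A. coordinates as recited for Figure 3.
girdwood = readthrough_extension(5762, 5783, 5763, 5765)  # (21, 7)
```

Under these assumptions the longer form extends nsP3 by 21 nucleotides, that is, seven codon positions, the first of which is the opal codon itself.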

Figure 3A shows nucleotides 1 through 3800 of the cDNA sequence
of Girdwood S.A.

Figure 3B shows nucleotides 3801 through 7900 of the cDNA
sequence of Girdwood S.A.

Figure 3C shows nucleotides 7901 through 11717 of the cDNA
sequence of Girdwood S.A.

Figure 4 illustrates the putative amino acid sequences of the
Girdwood S.A. polyproteins (SEQ ID NO:5 and SEQ ID NO:6). The amino
acids were derived from the Girdwood S.A. cDNA sequence given in Figure 3
(SEQ ID NO:4).

WO 98/36779 PCT/US98/02945

Figure 4A shows the amino acid sequence of the non-structural
polyprotein of Girdwood S.A. The sequence terminates at the opal termination
codon. The complete amino acid sequence is presented in SEQ ID NO:5.

Figure 4B shows the amino acid sequence of the structural
polyprotein of Girdwood S.A. (SEQ ID NO:6).

Figure 5 illustrates the nucleotide sequence (SEQ ID NO:7) of
clone pS55, a cDNA clone of the S.A.AR86 genomic RNA.

Figure 5A shows nucleotides 1 through 6720 of the cDNA sequence
of pS55.

Figure 5B shows nucleotides 6721 through 11663 of the cDNA
sequence of pS55.

Figure 6 presents the cDNA sequence (SEQ ID NO:8) of clone
pTR339. The TR339 virus is derived from this clone. Nucleotides 1 through 59
represent the 5' UTR, the non-structural polyprotein is encoded by nucleotides 60
through 7598 (nsP1--nt60 through nt1679; nsP2--nt1680 through nt4099;
nsP3--nt4100 through nt5747 or nt5768; nsP4--nt5769 through nt7598), the structural
polyprotein is encoded by nucleotides 7647 through 11381 (capsid--nt7647 through
nt8438; E3--nt8439 through nt8630; E2--nt8631 through nt9899; 6K--nt9900
through nt10064; E1--nt10065 through nt11381), and the 3' UTR is represented
by nucleotides 11382 through 11703. There is an opal termination codon at
nucleotides 5748 through 5750.

Figure 6A shows nucleotides 1 through 6720 of the cDNA sequence
of pTR339.

Figure 6B shows nucleotides 6721 through 11703 of the cDNA
sequence of pTR339.

DETAILED DESCRIPTION OF THE INVENTION
The production and use of recombinant DNA, vectors, transformed
host cells, selectable markers, proteins, and protein fragments by genetic
engineering are well-known to those skilled in the art. See, e.g., United States
Patent No. 4,761,371 to Bell et al. at Col. 6 line 3 to Col. 9 line 65; United States
Patent No. 4,877,729 to Clark et al. at Col. 4 line 38 to Col. 7 line 6; United
States Patent No. 4,912,038 to Schilling at Col. 3 line 26 to Col. 14 line 12; and
United States Patent No. 4,879,224 to Wallner at Col. 6 line 8 to Col. 8 line 59.

The term "alphavirus" has its conventional meaning in the art, and
includes the various species of alphaviruses such as Eastern Equine Encephalitis
virus (EEE), Venezuelan Equine Encephalitis virus (VEE), Everglades virus,
Mucambo virus, Pixuna virus, Western Equine Encephalitis virus (WEE), Sindbis virus,
South African Arbovirus No. 86, Girdwood S.A. virus, Ockelbo virus, Semliki
Forest virus, Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross
River virus, Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus,
Mayaro virus, Una virus, Aura virus, Whataroa virus, Babanki virus, Kyzylagach
virus, Highlands J virus, Fort Morgan virus, Ndumu virus, Buggy Creek virus,
and any other virus classified by the International Committee on Taxonomy of
Viruses (ICTV) as an alphavirus. The preferred alphaviruses for use in the present
invention include Sindbis virus strains (e.g., TR339), Girdwood S.A., S.A.AR86,
and Ockelbo82.

An "Old World alphavirus" is a virus that is primarily distributed
throughout the Old World. Alternately stated, an Old World alphavirus is a virus
that is primarily distributed throughout Africa, Asia, Australia and New Zealand,
or Europe. Exemplary Old World viruses include SF group alphaviruses and SIN
group alphaviruses. SF group alphaviruses include Semliki Forest virus,
Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross River virus,
Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus,
and Una virus. SIN group alphaviruses include Sindbis virus, South African
Arbovirus No. 86, Ockelbo virus, Girdwood S.A. virus, Aura virus, Whataroa
virus, Babanki virus, and Kyzylagach virus.

Acceptable alphaviruses include those containing attenuating
mutations. The phrases "attenuating mutation" and "attenuating amino acid," as
used herein, mean a nucleotide sequence containing a mutation, or an amino acid
encoded by a nucleotide sequence containing a mutation, which mutation results
in a decreased probability of causing disease in its host (i.e., a loss of virulence),
in accordance with standard terminology in the art, whether the mutation be a
substitution mutation or an in-frame deletion mutation. See, e.g., B. DAVIS ET
AL., MICROBIOLOGY 132 (3d ed. 1980). The phrase "attenuating mutation"
excludes mutations or combinations of mutations which would be lethal to the
virus.

Appropriate attenuating mutations will be dependent upon the
alphavirus used. Suitable attenuating mutations within the alphavirus genome will
be known to those skilled in the art. Exemplary attenuating mutations include, but
are not limited to, those described in United States Patent No. 5,505,947 to
Johnston et al., copending United States application 08/448,630 to Johnston et al.,
and copending United States application 08/446,932 to Johnston et al. It is
intended that all United States patent references be incorporated in their entirety
by reference.



Attenuating mutations may be introduced into the RNA by
performing site-directed mutagenesis on the cDNA which encodes the RNA, in
accordance with known procedures. See, Kunkel, Proc. Natl. Acad. Sci. USA 82,
488 (1985), the disclosure of which is incorporated herein by reference in its
entirety. Alternatively, mutations may be introduced into the RNA by replacement
of homologous restriction fragments in the cDNA which encodes for the RNA, in
accordance with known procedures.

I. Methods for Introducing and Expressing Heterologous RNA in Bone
Marrow Cells.

The present invention provides methods of using a recombinant 
alphavirus to introduce and express a heterologous RNA in bone marrow cells. 
Such methods are useful as vaccination strategies when the heterologous RNA 
encodes an immunogenic protein or peptide. Alternatively, such methods are 
useful in introducing and expressing in bone marrow cells an RNA which encodes 
a desirable protein or peptide, for example, a therapeutic protein or peptide.

The present invention is carried out using a recombinant alphavirus
to introduce a heterologous RNA into bone marrow cells. Any alphavirus that
targets and infects bone marrow cells is suitable. Preferred alphaviruses include
Old World alphaviruses, more preferably SF group alphaviruses and SIN group
alphaviruses, more preferably Sindbis virus strains (e.g., TR339), S.A.AR86
virus, Girdwood S.A. virus, and Ockelbo virus. In a more preferred embodiment,
the alphavirus contains one or more attenuating mutations, as described
hereinabove.

Two types of recombinant virus vector are contemplated in carrying
out the present invention. In one embodiment employing "double promoter
vectors," the heterologous RNA is inserted into a replication and propagation
competent virus. Double promoter vectors are described in United States Patent
No. 5,505,947 to Johnston et al. With this type of viral vector, it is preferable
that heterologous RNA sequences of less than 3 kilobases are inserted into the viral
vector, more preferably those less than 2 kilobases, and more preferably still those
less than 1 kilobase. In an alternate embodiment, propagation-defective "replicon
vectors," as described in copending United States application 08/448,630 to
Johnston et al., will be used. One advantage of replicon viral vectors is that larger
RNA inserts, up to approximately 4-5 kilobases in length, can be utilized. Double
promoter vectors and replicon vectors are described in more detail hereinbelow.
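
The insert-size guidance in the preceding paragraph can be expressed as a simple screening rule. The following Python sketch is illustrative only: the function name `candidate_vectors` is hypothetical, the 3-kilobase figure is the broadest preferred limit stated for double promoter vectors, and the 5-kilobase figure is taken as the upper end of the "approximately 4-5 kilobases" stated for replicon vectors. Neither number is a hard limit recited by the specification.

```python
from typing import List

# Assumed cut-offs, in nucleotides, read from the paragraph above.
DOUBLE_PROMOTER_MAX_NT = 3000  # "less than 3 kilobases" (preferred range)
REPLICON_MAX_NT = 5000         # upper end of "approximately 4-5 kilobases"


def candidate_vectors(insert_length_nt: int) -> List[str]:
    """List the vector types whose stated size guidance fits the insert."""
    if insert_length_nt <= 0:
        raise ValueError("insert length must be positive")
    candidates = []
    if insert_length_nt < DOUBLE_PROMOTER_MAX_NT:
        candidates.append("double promoter")
    if insert_length_nt <= REPLICON_MAX_NT:
        candidates.append("replicon")
    return candidates
```

For example, an insert of 1,500 nucleotides fits the guidance for both vector types, whereas an insert of 4,000 nucleotides fits only the replicon guidance.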
The recombinant alphaviruses of the claimed method target the
heterologous RNA to bone marrow cells, where it expresses the encoded protein or
peptide. Heterologous RNA can be introduced and expressed in any cell type found in the
bone marrow. Bone marrow cells that may be targeted by the recombinant alphaviruses of
the present invention include, but are not limited to, polymorphonuclear cells, hemopoietic
stem cells (including megakaryocyte colony forming units (CFU-M), spleen colony
forming units (CFU-S), erythroid colony forming units (CFU-E), erythroid burst forming
units (BFU-E), and colony forming units in culture (CFU-C)), erythrocytes, macrophages
(including reticular cells), monocytes, granulocytes, megakaryocytes, lymphocytes,
fibroblasts, osteoprogenitor cells, osteoblasts, osteoclasts, marrow stromal cells,
chondrocytes and other cells of synovial joints. Preferably, marrow cells within the
endosteum are targeted, more preferably osteoblasts. Also preferred are methods in which
cells in the endosteum of synovial joints (e.g., hip and knee joints) are targeted.

By targeting to the cells of the bone marrow, it is meant that the primary
site in which the virus will be localized in vivo is the cells of the bone marrow.
Alternately stated, the alphaviruses of the present invention target bone marrow cells, such
that titers in bone marrow two days after infection are greater than 100 PFU/g crushed
bone, preferably greater than 200 PFU/g crushed bone, more preferably greater than 300
PFU/g crushed bone, and more preferably still greater than 500 PFU/g crushed bone.
Virus may be detected occasionally in other cell or tissue types, but only sporadically and
usually at low levels. Virus localization in the bone marrow can be demonstrated by any
suitable technique known in the art, such as in situ hybridization.
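
The titer thresholds recited above form a simple tiered test. The Python sketch below is offered only as an illustration: the names `TITER_TIERS` and `targeting_tier` are hypothetical, the labels paraphrase the wording of the preceding paragraph, and the input is assumed to be a titer measured in bone marrow two days after infection, in PFU per gram of crushed bone.

```python
from typing import Optional

# Thresholds in PFU/g crushed bone, ordered from most to least stringent.
TITER_TIERS = [
    (500, "more preferably still"),
    (300, "more preferably"),
    (200, "preferably"),
    (100, "targets bone marrow"),
]


def targeting_tier(titer_pfu_per_g: float) -> Optional[str]:
    """Return the most stringent tier the titer exceeds, or None.

    Each threshold is applied as a strict "greater than" comparison,
    matching the wording of the specification.
    """
    for threshold, label in TITER_TIERS:
        if titer_pfu_per_g > threshold:
            return label
    return None
```

A titer of exactly 100 PFU/g returns `None` under this reading, because the text requires titers greater than 100 PFU/g.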

Bone marrow cells are long-lived and harbor infectious alphaviruses for a
prolonged period of time, as demonstrated in the Examples below. These characteristics
of bone marrow cells render the present invention useful not only for the purpose of
supplying a desired protein or peptide to skeletal tissue, but also for expressing proteins or
peptides in vivo that are needed by other cell or tissue types.

The present invention can be carried out in vivo or with cultured bone
marrow cells in vitro. Bone marrow cell cultures include primary cultures
of bone marrow cells, serially-passaged cultures of bone marrow cells, and
cultures of immortalized bone marrow cell lines. Bone marrow cells may be
cultured by any suitable means known in the art.



The recombinant alphaviruses of the present invention carry a
heterologous RNA segment. The heterologous RNA segment encodes a promoter
and an inserted heterologous RNA. The inserted heterologous RNA may encode
any protein or a peptide which is desirably expressed by the host bone marrow
cells. Suitable heterologous RNA may be of prokaryotic (e.g., RNA encoding the
Botulinus toxin C), or eukaryotic (e.g., RNA encoding malaria Plasmodium
protein cs1) origin. Illustrative proteins and peptides encoded by the heterologous
RNAs of the present invention include hormones, growth factors, interleukins,
cytokines, chemokines, enzymes, and ribozymes. Alternately, the heterologous
RNAs encode any therapeutic protein or peptide. As a further alternative, the
heterologous RNAs of the present invention encode any immunogenic protein or
peptide.



An immunogenic protein or peptide, or "immunogen," may be any
protein or peptide suitable for protecting the subject against a disease, including
but not limited to microbial, bacterial, protozoal, parasitic, and viral diseases. For
example, the immunogen may be an orthomyxovirus immunogen (e.g., an
influenza virus immunogen, such as the influenza virus hemagglutinin (HA)
surface protein or the influenza virus nucleoprotein gene, or an equine influenza
virus immunogen), or a lentivirus immunogen (e.g., an equine infectious anemia
virus immunogen, a Simian Immunodeficiency Virus (SIV) immunogen, or a
Human Immunodeficiency Virus (HIV) immunogen, such as the HIV envelope
GP160 protein and the HIV matrix/capsid proteins). The immunogen may also be
an arenavirus immunogen (e.g., Lassa fever virus immunogen, such as the Lassa
fever virus nucleocapsid protein gene and the Lassa fever envelope glycoprotein
gene), a poxvirus immunogen (e.g., vaccinia), a flavivirus immunogen (e.g., a
yellow fever virus immunogen or a Japanese encephalitis virus immunogen), a
filovirus immunogen (e.g., an Ebola virus immunogen, or a Marburg virus
immunogen), a bunyavirus immunogen (e.g., RVFV, CCHF, and SFS viruses),
or a coronavirus immunogen (e.g., an infectious human coronavirus immunogen,
such as the human coronavirus envelope glycoprotein gene, or a transmissible
gastroenteritis virus immunogen for pigs, or an infectious bronchitis virus
immunogen for chickens).

Alternatively, the present invention can be used to express
heterologous RNAs encoding antisense oligonucleotides. In general, "antisense"
refers to the use of small, synthetic oligonucleotides to inhibit gene expression by
inhibiting the function of the target mRNA containing the complementary
sequence. Milligan, J.F. et al., J. Med. Chem. 36(14), 1923-1937 (1993). Gene
expression is inhibited through hybridization to coding (sense) sequences in a
specific mRNA target by hydrogen bonding according to Watson-Crick base
pairing rules. The mechanism of antisense inhibition is that the exogenously
applied oligonucleotides decrease the mRNA and protein levels of the target gene.
Milligan, J.F. et al., J. Med. Chem. 36(14), 1923-1937 (1993). See also Helene,
C. and Toulme, J., Biochim. Biophys. Acta 1049, 99-125 (1990); Cohen, J.S.,
Ed., OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE
EXPRESSION, CRC Press: Boca Raton, FL (1987).

Antisense oligonucleotides may be of any suitable length, depending
on the particular target being bound. The only limit on the length of the antisense
oligonucleotide is the capacity of the virus for inserted heterologous RNA.
Antisense oligonucleotides may be complementary to the entire mRNA transcript
of the target gene or only a portion thereof. Preferably the antisense
oligonucleotide is directed to an mRNA region containing a junction between
intron and exon. Where the antisense oligonucleotide is directed to an intron/exon
junction, it may either entirely overlie the junction or may be sufficiently close to
the junction to inhibit splicing out of the intervening exon during processing of
precursor mRNA to mature mRNA (e.g., with the 3' or 5' terminus of the
antisense oligonucleotide being positioned within about, for example, 10, 5, 3 or
2 nucleotides of the intron/exon junction). Also preferred are antisense
oligonucleotides which overlap the initiation codon.
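
Watson-Crick complementarity, on which antisense inhibition relies, can be illustrated in a few lines. The Python sketch below is a hedged illustration rather than a design method taught by the specification: the function names and the `flank` parameter are hypothetical, the input is assumed to be an mRNA written 5' to 3' in the letters A, C, G and U, and the first AUG in the sequence is assumed to be the initiation codon.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense(mrna: str) -> str:
    """Return the reverse complement of an RNA sequence, written 5' to 3'."""
    return mrna.upper().translate(_COMPLEMENT)[::-1]


def antisense_over_start_codon(mrna: str, flank: int = 6) -> str:
    """Return an antisense sequence overlapping the first AUG.

    `flank` is the number of nucleotides included on each side of the
    AUG where available (a hypothetical design parameter).
    """
    seq = mrna.upper()
    index = seq.find("AUG")
    if index < 0:
        raise ValueError("no AUG found in sequence")
    start = max(0, index - flank)
    end = min(len(seq), index + 3 + flank)
    return antisense(seq[start:end])
```

For instance, `antisense("AUG")` gives `"CAU"`, the sequence that pairs with the initiation codon when the two strands are aligned antiparallel.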

When practicing the present invention, the antisense oligonucleotides
administered may be related in origin to the species to which they are administered.
When treating humans, human antisense may be used if desired.

Promoters for use in carrying out the present invention are operable 
in bone marrow cells. An operable promoter in bone marrow cells is a promoter 
that is recognized by and functions in bone marrow cells. Promoters for use with 
the present invention must also be operatively associated with the heterologous 
RNA to be expressed in the bone marrow. A promoter is operably linked to a 
heterologous RNA if it controls the transcription of the heterologous RNA, where 
the heterologous RNA comprises a coding sequence. Suitable promoters are well 
known in the art. The Sindbis 26S promoter is preferred when the alphavirus is 
a strain of Sindbis virus. Additional preferred promoters beyond the Sindbis 26S 
promoter include the Girdwood S.A. 26S promoter when the alphavirus is
Girdwood S.A., the S.A.AR86 26S promoter when the alphavirus is S.A.AR86, 
and any other promoter sequence recognized by alphavirus polymerases. 
Alphavirus promoter sequences containing mutations which alter the activity level 
of the promoter (in relation to the activity level of the wild-type) are also suitable 
in the practice of the present invention. Such mutant promoter sequences are 
described in Raju and Huang, J. Virol. 65, 2501-2510 (1991), the disclosure of 
which is incorporated in its entirety by reference. 

The heterologous RNA is introduced into the bone marrow cells by 
contacting the recombinant alphavirus carrying the heterologous RNA segment to 
the bone marrow cells. By contacting, it is meant bringing the recombinant 
alphavirus and the bone marrow cells in physical proximity. The contacting step 
can be performed in vitro or in vivo. In vitro contacting can be carried out with 
cultures of immortalized or non-immortalized bone marrow cells. In one particular 
embodiment, bone marrow cells can be removed from a subject, cultured in vitro, 
infected with the vector, and then introduced back into the subject. Contacting is 
performed in vivo when the recombinant alphavirus is administered to a subject. 
Pharmaceutical formulations of recombinant alphavirus can be administered to a 
subject parenterally (e.g., by subcutaneous, intracerebral, intradermal, intramuscular, 
intravenous and intraarticular administration). Alternatively, pharmaceutical 
formulations of the present invention may be suitable for administration to the 
mucous membranes of a subject (e.g., intranasal administration, by use of a 
dropper, swab, or inhaler). Methods of preparing infectious virus particles and 
pharmaceutical formulations thereof are discussed in more detail hereinbelow. 






By "introducing" the heterologous RNA segment into the bone 
marrow cells it is meant infecting the bone marrow cells with recombinant 
alphavirus containing the heterologous RNA, such that the viral vector carrying the 
heterologous RNA enters the bone marrow cells and can be expressed therein. As 
used with respect to the present invention, when the heterologous RNA is 
"expressed," it is meant that the heterologous RNA is transcribed. In particular 
embodiments of the invention in which it is desired to produce a protein or 
peptide, expression further includes the steps of post-transcriptional processing and 
translation of the mRNA transcribed from the heterologous RNA. In contrast, 
where the heterologous RNA encodes an antisense oligonucleotide, expression need 
not include post-transcriptional processing and translation. With respect to 
embodiments in which the heterologous RNA encodes an immunogenic protein or 
a protein being administered for therapeutic purposes, expression may also include 
the further step of post-translational processing to produce an immunogenic or 
therapeutically-active protein. 



The present invention also provides infectious RNAs, as described 
hereinabove, and cDNAs encoding the same. Preferably the infectious RNAs and 
cDNAs are derived from the S.A.AR86, Girdwood S.A., TR339, or Ockelbo 
viruses. The cDNA clones can be generated by any of a variety of suitable 
methods known to those skilled in the art. A preferred method is the method set 
forth in United States Patent No. 5,185,440 to Davis et al., the disclosure of which 
is incorporated in its entirety by reference, and Gubler et al., Gene 25:263 (1983). 

RNA is preferably synthesized from the DNA sequence in vitro 
using purified RNA polymerase in the presence of ribonucleotide triphosphates and 
cap analogs in accordance with conventional techniques. However, the RNA may 
also be synthesized intracellularly after introduction of the cDNA. 



A. Double Promoter Vectors 

In one embodiment of the invention, double promoter vectors are 
used to introduce the heterologous RNA into the target bone marrow cells. A 
double promoter virus vector is a replication and propagation competent virus. 
Double promoter vectors are described in United States Patent No. 5,505,947 to 
Johnston et al., the disclosure of which is incorporated in its entirety by reference. 
Preferred alphaviruses for constructing the double promoter vectors are S.A.AR86, 
Girdwood S.A., TR339 and Ockelbo viruses. More preferably, the double 
promoter vector contains one or more attenuating mutations. Attenuating 
mutations are described in more detail hereinabove. 



The double promoter vector is constructed so as to contain a second 
subgenomic promoter (i.e., 26S promoter) inserted 3' to the virus RNA encoding 
the structural proteins. The heterologous RNA is inserted between the second 
subgenomic promoter, so as to be operatively associated therewith, and the 3' 
UTR of the virus genome. Heterologous RNA sequences of less than 3 kilobases, 
more preferably those less than 2 kilobases, and more preferably still those less 
than 1 kilobase, can be inserted into the double promoter vector. In a preferred 
embodiment of the invention, the double promoter vector is derived from 
Girdwood S.A., and the second subgenomic promoter is a duplicate of the 
Girdwood S.A. subgenomic promoter. In an alternate preferred embodiment, the 
double promoter vector is derived from TR339, and the second subgenomic 
promoter is a duplicate of the TR339 subgenomic promoter. 

B. Replicon Vectors 

Replicon vectors, which are propagation-defective virus vectors, can 
also be used to carry out the present invention. Replicon vectors are described in 
more detail in copending United States Application 08/448,630 to Johnston et al., 
the disclosure of which is incorporated in its entirety by reference. Preferred 
alphaviruses for constructing the replicon vectors are S.A.AR86, Girdwood S.A., 
TR339, and Ockelbo. 

In general, in the replicon system, a foreign gene to be expressed 
is inserted in place of at least one of the viral structural protein genes in a 
transcription plasmid containing an otherwise full-length cDNA copy of the 
alphavirus genome RNA. RNA transcribed from this plasmid contains an intact 
copy of the viral nonstructural genes which are responsible for RNA replication 
and transcription. Thus, if the transcribed RNA is transfected into susceptible 
cells, it will be replicated and translated to give the nonstructural proteins. These 
proteins will transcribe the transfected RNA to give high levels of subgenomic 
mRNA, which will then be translated to produce high levels of the foreign protein. 
The autonomously replicating RNA (i.e., replicon) can only be packaged into virus 
particles if the alphavirus structural protein genes are provided on one or more 
"helper" RNAs, which are cotransfected into cells along with the replicon RNA. 
The helper RNAs do not contain the viral nonstructural genes for replication, but 
these functions are provided in trans by the replicon RNA. Similarly, the 
transcriptase functions translated from the replicon RNA transcribe the structural 
protein genes on the helper RNA, resulting in the synthesis of viral structural 
proteins and packaging of the replicon into virus-like particles. As the packaging 
or encapsidation signal for alphavirus RNAs is located within the nonstructural 
genes, the absence of these sequences in the helper RNAs precludes their 
incorporation into virus particles. 
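
The packaging logic of the replicon system described above can be summarized in a short, deliberately simplified model. This is a sketch under stated assumptions: the `RNA` record, its field names, and the function `packaged_rnas` are hypothetical names introduced here, and the model treats "contains the nonstructural genes" as equivalent to "carries the packaging signal," as the preceding paragraph states.

```python
from dataclasses import dataclass

# The three structural proteins named in the specification.
STRUCTURAL = frozenset({"capsid", "E1", "E2"})


@dataclass(frozen=True)
class RNA:
    name: str
    has_nonstructural_genes: bool  # replication functions and packaging signal
    structural_genes: frozenset    # structural proteins this RNA encodes


def packaged_rnas(rnas):
    """Names of co-transfected RNAs expected to end up inside particles."""
    # Replication and transcription require the nonstructural genes,
    # supplied by the replicon and acting in trans on the helpers.
    replication_available = any(r.has_nonstructural_genes for r in rnas)
    provided = frozenset().union(*(r.structural_genes for r in rnas))
    if not replication_available or not provided >= STRUCTURAL:
        return []  # no particles form without a full set of structural proteins
    # Only RNAs carrying the packaging signal are encapsidated.
    return [r.name for r in rnas if r.has_nonstructural_genes]


replicon = RNA("replicon", True, frozenset())
helper1 = RNA("helper 1", False, frozenset({"E1", "E2"}))
helper2 = RNA("helper 2", False, frozenset({"capsid"}))
print(packaged_rnas([replicon, helper1, helper2]))
```

In this model the helper RNAs contribute structural proteins but are never packaged themselves, which mirrors why the resulting particles are propagation defective.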

Alphavirus-permissive cells employed in the methods of the present 
invention are cells which, upon transfection with the viral RNA transcript, are 
capable of producing viral particles. Preferred alphavirus-permissive cells are 
TR339-permissive cells, Girdwood S.A.-permissive cells, S.A.AR86-permissive cells, and 
Ockelbo-permissive cells. Alphaviruses have a broad host range. Examples of suitable 
host cells include, but are not limited to, Vero cells, baby hamster kidney (BHK) cells, and 
chicken embryo fibroblast cells. 



The phrase "structural protein" as used herein refers to the encoded 
proteins which are required for encapsidation (e.g., packaging) of the RNA replicon, and 
include the capsid protein, E1 glycoprotein, and E2 glycoprotein. As described 
hereinabove, the structural proteins of the alphavirus are distributed among one or more 
helper RNAs (i.e., a first helper RNA and a second helper RNA). In addition, one or 
more structural proteins may be located on the same RNA molecule as the replicon RNA, 
provided that at least one structural protein is deleted from the replicon RNA such that the 
resulting alphavirus particle is propagation defective. As used herein, the terms "deleted" 
or "deletion" mean either total deletion of the specified segment or the deletion of a 
sufficient portion of the specified segment to render the segment inoperative or 
nonfunctional, in accordance with standard usage. See, e.g., U.S. Patent No. 4,650,764 
to Temin et al. The term "propagation defective" as used herein, means that the replicon 
RNA cannot be encapsidated in the host cell in the absence of the helper RNA. The 
resulting alphavirus replicon particles are propagation defective inasmuch as the replicon 
RNA in these particles does not include all of the alphavirus structural proteins required 
for encapsidation, at least one of the required structural proteins being deleted therefrom, 
such that the replicon RNA initiates only an abortive infection; no new viral particles are 
produced, and there is no spread of the infection to other cells. 

The helper cell for expressing the infectious, propagation defective alphavirus 
particle comprises a set of RNAs, as described above. The set of RNAs principally 
includes a first helper RNA and a second helper RNA. The first helper RNA includes 
RNA encoding at least one alphavirus structural protein but does not encode all alphavirus 
structural proteins. In other words, the first helper RNA does not encode at least one 
alphavirus structural protein; the at least one non-coded alphavirus structural protein being 
deleted from the first helper RNA. 

In one embodiment, the first helper RNA includes RNA encoding the alphavirus 
E1 glycoprotein, with the alphavirus capsid protein and the alphavirus E2 
glycoprotein being deleted from the first helper RNA. In another embodiment, the 
first helper RNA includes RNA encoding the alphavirus E2 glycoprotein, with the 
alphavirus capsid protein and the alphavirus E1 glycoprotein being deleted from 
the first helper RNA. In a third, preferred embodiment, the first helper RNA 
includes RNA encoding the alphavirus E1 glycoprotein and the alphavirus E2 
glycoprotein, with the alphavirus capsid protein being deleted from the first helper 
RNA. 

The second helper RNA includes RNA encoding at least one 
alphavirus structural protein which is different from the at least one structural 
protein encoded by the first helper RNA. Thus, the second helper RNA encodes 
at least one alphavirus structural protein which is not encoded by the first helper 
RNA. The second helper RNA does not encode the at least one alphavirus 
structural protein which is encoded by the first helper RNA; thus the first and 
second helper RNAs do not encode duplicate structural proteins. In the 
embodiment wherein the first helper RNA includes RNA encoding only the 
alphavirus E1 glycoprotein, the second helper RNA may include RNA encoding 
one or both of the alphavirus capsid protein and the alphavirus E2 glycoprotein 
which are deleted from the first helper RNA. In the embodiment wherein the first 
helper RNA includes RNA encoding only the alphavirus E2 glycoprotein, the 
second helper RNA may include RNA encoding one or both of the alphavirus 
capsid protein and the alphavirus E1 glycoprotein which are deleted from the first 
helper RNA. In the embodiment wherein the first helper RNA includes RNA 
encoding both the alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein, 
the second helper RNA may include RNA encoding the alphavirus capsid protein 
which is deleted from the first helper RNA. 
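
The constraints on the two helper RNAs stated above (each encodes at least one but not all structural proteins, and the two encode no duplicates) can be checked mechanically. The following is a minimal sketch; the function name `check_helper_pair` and the string labels for the proteins are assumptions made for illustration.

```python
STRUCTURAL = frozenset({"capsid", "E1", "E2"})


def check_helper_pair(first, second):
    """Return a list of violated constraints for a proposed helper RNA pair."""
    first, second = frozenset(first), frozenset(second)
    problems = []
    for label, genes in (("first", first), ("second", second)):
        if not genes:
            problems.append(f"{label} helper encodes no structural protein")
        if not genes <= STRUCTURAL:
            problems.append(f"{label} helper encodes an unrecognized protein")
        if genes >= STRUCTURAL:
            problems.append(f"{label} helper encodes all structural proteins")
    if first & second:
        problems.append("helpers encode duplicate structural proteins")
    return problems


# The third, preferred split: E1 and E2 on the first helper, capsid on the second.
print(check_helper_pair({"E1", "E2"}, {"capsid"}))
```

The check does not require the two helpers together to cover all three proteins, because the text allows a remaining structural protein to be carried on the replicon RNA itself.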

In one embodiment, the packaging segment (RNA comprising the 
encapsidation or packaging signal) is deleted from at least the first helper RNA. 
In a preferred embodiment, the packaging segment is deleted from both the first 
helper RNA and the second helper RNA. 

In the preferred embodiment wherein the packaging segment is 
deleted from both the first helper RNA and the second helper RNA, the helper cell 
is co-transfected with a replicon RNA in addition to the first helper RNA and the 
second helper RNA. The replicon RNA encodes the packaging segment and an 
inserted heterologous RNA. The inserted heterologous RNA may be RNA 
encoding a protein or a peptide. In a preferred embodiment, the replicon RNA, 
the first helper RNA and the second helper RNA are provided on separate 
molecules such that a first molecule, i.e., the replicon RNA, includes RNA 
encoding the packaging segment and the inserted heterologous RNA, a second 
molecule, i.e., the first helper RNA, includes RNA encoding at least one but not 
all of the required alphavirus structural proteins, and a third molecule, i.e., the 
second helper RNA, includes RNA encoding at least one but not all of the required 
alphavirus structural proteins. For example, in one preferred embodiment of the 
present invention, the helper cell includes a set of RNAs which include (a) a 
replicon RNA including RNA encoding an alphavirus packaging sequence and an 
inserted heterologous RNA, (b) a first helper RNA including RNA encoding the 
alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein, and (c) a second 
helper RNA including RNA encoding the alphavirus capsid protein so that the 
alphavirus E1 glycoprotein, the alphavirus E2 glycoprotein and the capsid protein 
assemble together into alphavirus particles in the host cell. 

In an alternate embodiment, the replicon RNA and the first helper 
RNA are on separate molecules, and the replicon RNA and RNA encoding a 
structural gene not encoded by the first helper RNA are on another single molecule 
together, such that a first molecule, i.e., the first helper RNA, includes RNA 
encoding at least one but not all of the required alphavirus structural proteins, and 
a second molecule, i.e., the replicon RNA, includes RNA encoding the packaging 
segment, the inserted heterologous RNA, and the remaining structural proteins not 
encoded by the first helper RNA. For example, in one preferred embodiment of 
the present invention, the helper cell includes a set of RNAs including (a) a 
replicon RNA including RNA encoding an alphavirus packaging sequence, an 
inserted heterologous RNA, and an alphavirus capsid protein, and (b) a first helper 
RNA including RNA encoding the alphavirus E1 glycoprotein and the alphavirus 
E2 glycoprotein so that the alphavirus E1 glycoprotein, the alphavirus E2 
glycoprotein and the capsid protein assemble together into alphavirus particles in 
the host cell, with the replicon RNA packaged therein. 

In one preferred embodiment of the present invention, the RNA 
encoding the alphavirus structural proteins, i.e., the capsid, E1 glycoprotein and 
E2 glycoprotein, contains at least one attenuating mutation, as described 
hereinabove. Thus, according to this embodiment, at least one of the first helper 
RNA and the second helper RNA includes at least one attenuating mutation. In 
a more preferred embodiment, at least one of the first helper RNA and the second 
helper RNA includes at least two, or multiple, attenuating mutations. The multiple 
attenuating mutations may be positioned in either the first helper RNA or in the 
second helper RNA, or they may be distributed randomly with one or more 
attenuating mutations being positioned in the first helper RNA and one or more 
attenuating mutations positioned in the second helper RNA. Alternatively, when 
the replicon RNA and the RNA encoding the structural proteins not encoded by 
the first helper RNA are located on the same molecule, an attenuating mutation 
may be positioned in the RNA which codes for the structural protein not encoded 
by the first helper RNA. The attenuating mutations may also be located within the 
RNA encoding non-structural proteins (e.g., the replicon RNA). 

Preferably, the first helper RNA and the second helper RNA also 
include a promoter. It is also preferred that the replicon RNA also includes a 
promoter. Suitable promoters for inclusion in the first helper RNA, second helper 
RNA and replicon RNA are well known in the art. One preferred promoter is the 
Girdwood S.A. 26S promoter for use when the alphavirus is Girdwood S.A. 
Another preferred promoter is the TR339 26S promoter for use when the 
alphavirus is TR339. Additional promoters beyond the Girdwood S.A. and TR339 
promoters include the VEE 26S promoter, the Sindbis 26S promoter, the Semliki 
Forest 26S promoter, and any other promoter sequence recognized by alphavirus 
polymerases. Alphavirus promoter sequences containing mutations which alter the 
activity level of the promoter (in relation to the activity level of the wild-type) are 
also suitable in the practice of the present invention. Such mutant promoter 
sequences are described in Raju and Huang, J. Virol. 65, 2501-2510 (1991), the 
disclosure of which is incorporated herein in its entirety. In the system wherein 
the first helper RNA, the second helper RNA, and the replicon RNA are all on 
separate molecules, the promoters, if the same promoter is used for all three 
RNAs, provide a homologous sequence between the three molecules. It is 
preferred that the selected promoter is operative with the non-structural proteins 
encoded by the replicon RNA molecule. 

In cases where vaccination with two immunogens provides improved 
protection against disease as compared to vaccination with only a single 
immunogen, a double-promoter replicon would ensure that both immunogens are 
produced in the same cell. Such a replicon would be the same as the one 
described above, except that it would contain two copies of the 26S RNA 
promoter, each followed by a different multiple cloning site, to allow for the 
insertion and expression of two different heterologous proteins. Another useful 
strategy is to insert the IRES sequence from the picornavirus EMC virus between 
the two heterologous genes downstream from the single 26S promoter of the 
replicon described above, thus leading to expression of two immunogens from the 
single replicon transcript in the same cell. 

C. Uses of the Present Invention 

The alphavirus vectors, RNAs, cDNAs, helper cells, infectious virus 
particles, and methods of the present invention find use in in vitro expression 
systems, wherein the inserted heterologous RNA encodes a protein or peptide 
which is desirably produced in vitro. The RNAs, cDNAs, helper cells, infectious 
virus particles, methods, and pharmaceutical formulations of the present invention 
are additionally useful in a method of administering a protein or peptide to a 
subject in need of the protein or peptide, as a method of treatment or otherwise. 
In this embodiment of the invention, the heterologous RNA encodes the desired 
protein or peptide, and pharmaceutical formulations of the present invention are 
administered to a subject in need of the desired protein or peptide. In this manner, 
the protein or peptide may thus be produced in vivo in the subject. The subject 
may be in need of the protein or peptide because the subject has a deficiency 
thereof, or because the production of the protein or peptide in the subject may 
impart some therapeutic effect, as a method of treatment or otherwise. 

Alternately, the claimed methods provide a vaccination strategy, 
wherein the heterologous RNA encodes an immunogenic protein or peptide. 

The methods and products of the invention are also useful as 
antigens and for evoking the production of antibodies in animals such as horses 
and rabbits, from which the antibodies may be collected and then used in 
diagnostic assays in accordance with known techniques. 

A further aspect of the present invention is a method of introducing 
and expressing antisense oligonucleotides in bone marrow cell cultures to regulate 
gene expression. Alternately, the claimed method finds use in introducing and 
expressing a protein or peptide in bone marrow cell cultures. 

II. Girdwood S.A. and TR339 Clones 
Disclosed hereinbelow are genomic RNA sequences encoding live 
Girdwood S.A. virus, live S.A.AR86 virus, and live Sindbis strain TR339 virus, 
cDNAs derived therefrom, infectious RNA transcripts encoded by the cDNAs, 
infectious viral particles containing the infectious RNA transcripts, and 
pharmaceutical formulations derived therefrom. 

The cDNA sequence of Girdwood S.A. is given herein as SEQ ID 
NO:4. Alternatively, the cDNA may have a sequence which differs from the 
cDNA of SEQ ID NO:4, but which has the same protein sequence as the cDNA 
given herein as SEQ ID NO:4. Thus, the cDNA may include one or more silent 
mutations. 



The phrase "silent mutation" as used herein refers to mutations in 
the cDNA coding sequence which do not produce mutations in the corresponding 
protein sequence translated therefrom. 

Likewise, the cDNA sequence of TR339 is given herein as SEQ ID 
NO:8. Alternatively, the cDNA may have a sequence which differs from the 
cDNA of SEQ ID NO:8, but which has the same protein sequence as the cDNA 
given herein as SEQ ID NO:8. Thus, the cDNA may include one or more silent 
mutations. 
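
As a hedged illustration of the "silent mutation" definition above, the sketch below translates two short coding sequences with the standard genetic code and tests whether they yield the same protein. The toy sequences are invented for the example and are not taken from SEQ ID NO:4 or SEQ ID NO:8; the function names and the assumption that each sequence is a single in-frame coding region are likewise introduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna):
    """Translate an in-frame cDNA coding sequence into one-letter amino acids."""
    cdna = cdna.upper()
    if len(cdna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[cdna[i:i + 3]] for i in range(0, len(cdna), 3))


def differs_only_by_silent_mutations(reference, variant):
    """True if both sequences have equal length and encode the same protein."""
    return len(reference) == len(variant) and translate(reference) == translate(variant)


reference = "ATGGCTGAA"  # hypothetical fragment: Met-Ala-Glu
silent = "ATGGCCGAG"     # GCT->GCC and GAA->GAG leave the protein unchanged
missense = "ATGGATGAA"   # GCT->GAT changes Ala to Asp
print(differs_only_by_silent_mutations(reference, silent))
print(differs_only_by_silent_mutations(reference, missense))
```

A full alphavirus genome contains untranslated regions and more than one open reading frame, so a real comparison would need to be applied to each coding region separately.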



The cDNAs encoding infectious Girdwood S.A. and TR339 virus 
RNA transcripts of the present invention include those homologous to, and having 
essentially the same biological properties as, the cDNA sequences disclosed herein 
as SEQ ID NO:4 and SEQ ID NO:8, respectively. Thus, cDNAs that hybridize 
to cDNAs encoding infectious Girdwood S.A. or TR339 virus RNA transcripts 
disclosed herein are also an aspect of this invention. Conditions which will permit 
other cDNAs encoding infectious Girdwood S.A. or TR339 virus transcripts to 
hybridize to the cDNAs disclosed herein can be determined in accordance with 
known techniques. For example, hybridization of such sequences may be carried 
out under conditions of reduced stringency, medium stringency, or even high 
stringency (e.g., conditions represented by a wash stringency of 35-40% 
formamide with 5X Denhardt's solution, 0.5% SDS and 1X SSPE at 37°C; 
conditions represented by a wash stringency of 40-45% formamide with 5X 
Denhardt's solution, 0.5% SDS, and 1X SSPE at 42°C; and conditions represented 
by a wash stringency of 50% formamide with 5X Denhardt's solution, 0.5% SDS 
and 1X SSPE at 42°C, respectively), to cDNA encoding infectious Girdwood S.A. 
or TR339 virus RNA transcripts disclosed herein in a standard hybridization assay. 
See J. SAMBROOK ET AL., MOLECULAR CLONING: A LABORATORY 
MANUAL (2d ed. 1989). In general, cDNA sequences encoding infectious 
Girdwood S.A. or TR339 virus RNA transcripts that hybridize to the cDNAs 
disclosed herein will be at least 30% homologous, 50% homologous, 75% 
homologous, and even 95% homologous or more with the cDNA sequences 
encoding infectious Girdwood S.A. or TR339 virus RNA transcripts disclosed 
herein. 



Promoter sequences and Girdwood S.A. virus or Sindbis virus strain 
TR339 cDNA clones are operatively associated in the present invention such that 
the promoter causes the cDNA clone to be transcribed in the presence of an RNA 
polymerase which binds to the promoter. The promoter is positioned on the 5' end 
(with respect to the virion RNA sequence) of the cDNA clone. An excessive 
number of nucleotides between the promoter sequence and the cDNA clone will 
result in the inoperability of the construct. Hence, the number of nucleotides 
between the promoter sequence and the cDNA clone is preferably not more than 
eight, more preferably not more than five, still more preferably not more than 
three, and most preferably not more than one. 
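
The graded spacing preference in the preceding paragraph can be written as a simple lookup. This is only a sketch: the function name `spacing_preference` and the returned labels are assumptions, and the thresholds are the ones stated in the text.

```python
def spacing_preference(n_nucleotides):
    """Classify the number of nucleotides between promoter and cDNA clone."""
    if n_nucleotides < 0:
        raise ValueError("spacing cannot be negative")
    # Thresholds from the text, checked from the tightest tier outward.
    tiers = (
        (1, "most preferred"),
        (3, "still more preferred"),
        (5, "more preferred"),
        (8, "preferred"),
    )
    for limit, label in tiers:
        if n_nucleotides <= limit:
            return label
    return "outside the preferred range"


for spacing in (0, 2, 5, 8, 12):
    print(spacing, spacing_preference(spacing))
```

The text does not state the spacing at which a construct becomes inoperable, so values above eight are reported only as falling outside the preferred range.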

Examples of promoters which are useful in the cDNA sequences of 
the present invention include, but are not limited to, T3 promoters, T7 promoters, 
cytomegalovirus (CMV) promoters, and SP6 promoters. The DNA sequence of 
the present invention may reside in any suitable transcription vector. The DNA 
sequence preferably has a complementary DNA sequence bound thereto so that the 
double-stranded sequence will serve as an active template for RNA polymerase. 
The transcription vector preferably comprises a plasmid. When the DNA sequence 
comprises a plasmid, it is preferred that a unique restriction site be provided 3' 
(with respect to the virion RNA sequence) to the cDNA clone. This provides a 
means for linearizing the DNA sequence to allow the transcription of genome- 
length RNA in vitro. 



The cDNA clones can be generated by any of a variety of suitable 
methods known to those skilled in the art. A preferred method is the method set 
forth in United States Patent No. 5,185,440 to Davis et al., the disclosure of which 
is incorporated in its entirety by reference, and Gubler et al., Gene 25:263 (1983). 

RNA is preferably synthesized from the DNA sequence in vitro 
using purified RNA polymerase in the presence of ribonucleotide triphosphates and 
cap analogs in accordance with conventional techniques. However, the RNA may 
also be synthesized intracellularly after introduction of the cDNA. 

The Girdwood S.A. and TR339 cDNA clones and the infectious 
RNAs and infectious virus particles produced therefrom of the present invention 
are useful for the preparation of pharmaceutical formulations, such as vaccines. 
In addition, the cDNA clones, infectious RNAs, and infectious viral particles of 
the present invention are useful for administration to animals for the purpose of 
producing antibodies to the Girdwood S.A. virus or the Sindbis virus strain 
TR339, which antibodies may be collected and used in known diagnostic 
techniques for the detection of Girdwood S.A. virus or Sindbis virus strain TR339. 
Antibodies can also be generated to the viral proteins expressed from the cDNAs 
disclosed herein. As another aspect of the present invention, the claimed cDNA 
clones are useful as nucleotide probes to detect the presence of Girdwood S.A. or 
TR339 genomic RNA or transcripts. 

III. Infectious Virus Particles and Pharmaceutical Formulations 

The infectious virus particles of the present invention include those 
containing double promoter vectors and those containing replicon vectors as 
described hereinabove. Alternately, the infectious virus particles contain infectious 
RNAs encoding the Girdwood S.A. or TR339 genome. When the infectious RNA 
comprises the Girdwood S.A. genome, preferably the RNA has the sequence 
encoded by the cDNA given as SEQ ID NO:4. When the infectious RNA 
comprises the TR339 genome, preferably the RNA has the sequence encoded by 
the cDNA given as SEQ ID NO:8. 

The infectious alphavirus particles of the present invention may be 
prepared according to the methods disclosed herein in combination with techniques 
known to those skilled in the art. These methods include transfecting an 
alphavirus-permissive cell with a replicon RNA including the alphavirus packaging 
segment and an inserted heterologous RNA, a first helper RNA including RNA 
encoding at least one alphavirus structural protein, and a second helper RNA 
including RNA encoding at least one alphavirus structural protein which is 
different from that encoded by the first helper RNA. Alternately, and preferably, 
at least one of the helper RNAs is produced from a cDNA encoding the helper 
RNA and operably associated with an appropriate promoter, the cDNA being 
stably transfected and integrated into the cells. More preferably, all of the helper 
RNAs will be "launched" from stably transfected cDNAs. The step of transfecting 
the alphavirus-permissive cell can be carried out according to any suitable means 
known to those skilled in the art, as described above with respect to propagation- 
competent viruses. 



Uptake of propagation-competent RNA into the cells in vitro can be 
carried out according to any suitable means known to those skilled in the art. 
Uptake of RNA into the cells can be achieved, for example, by treating the cells 
with DEAE-dextran, treating the RNA with LIPOFECTIN® before addition to the 
cells, or by electroporation, with electroporation being the currently preferred 
means. These techniques are well known in the art. See, e.g., United States 
Patent No. 5,185,440 to Davis et al., and PCT Publication No. WO 92/10578 to 
Bioption AB, the disclosures of which are incorporated herein by reference in their 
entirety. Uptake of propagation-competent RNA into the cell in vivo can be 
carried out by administering the infectious RNA to a subject as described in 
Section I above. 



The infectious RNAs may also contain a heterologous RNA 
segment, where the heterologous RNA segment contains a heterologous RNA and 
a promoter operably associated therewith. It is preferred that the infectious RNA 
introduces and expresses the heterologous RNA in bone marrow cells as described 
in Section I above. According to this embodiment, it is preferable that the 
promoter operatively associated with the heterologous RNA is operable in bone 
marrow cells. The heterologous RNA may encode any protein or peptide, 
preferably an immunogenic protein or peptide, a therapeutic protein or peptide, a 
hormone, a growth factor, an interleukin, a cytokine, a chemokine, an enzyme, 
a ribozyme, or an antisense oligonucleotide as described in more detail in Section 
I above. 



The step of facilitating the production of the infectious viral particles 
in the cells may be carried out using conventional techniques. See, e.g., United 
States Patent No. 5,185,440 to Davis et al., PCT Publication No. WO 92/10578 
to Bioption AB, and United States Patent No. 4,650,764 to Temin et al. (although 
Temin et al. relates to retroviruses rather than alphaviruses). The infectious viral 
particles may be produced by standard cell culture growth techniques. 

The step of collecting the infectious virus particles may also be 
carried out using conventional techniques. For example, the infectious particles 
may be collected by cell lysis, or collection of the supernatant of the cell culture, 
as is known in the art. See, e.g., United States Patent No. 5,185,440 to Davis et 
al., PCT Publication No. WO 92/10578 to Bioption AB, and United States Patent 
No. 4,650,764 to Temin et al. Other suitable techniques will be known to those 
skilled in the art. Optionally, the collected infectious virus particles may be 
purified if desired. Suitable purification techniques are well known to those skilled 
in the art. 



Pharmaceutical formulations, such as vaccines, of the present 
invention comprise an immunogenic amount of the infectious virus particles in 
combination with a pharmaceutically acceptable carrier. An "immunogenic 
amount" is an amount of the infectious virus particles which is sufficient to evoke 
an immune response in the subject to which the pharmaceutical formulation is 
administered. An amount of from about 10^ to about 10' particles, and preferably 
about 10* to 10« particles per dose is believed suitable, depending upon the age and 
species of the subject being treated, and the immunogen against which the immune 
response is desired. 

Pharmaceutical formulations of the present invention for therapeutic 
use comprise a therapeutic amount of the infectious virus particles in combination 
with a pharmaceutically acceptable carrier. A "therapeutic amount" is an amount 
of the infectious virus particles which is sufficient to produce a therapeutic effect 
(e.g., triggering an immune response or supplying a protein to a subject in need 
thereof) in the subject to which the pharmaceutical formulation is administered. 
The therapeutic amount will depend upon the age and species of the subject being 
treated, and the therapeutic protein or peptide being administered. Typical dosages 
are an amount from about 10' to about 10* infectious units. 

Exemplary pharmaceutically acceptable carriers include, but are not 
limited to, sterile pyrogen-free water and sterile pyrogen-free physiological saline 
solution. Subjects which may be administered immunogenic amounts of the 
infectious virus particles of the present invention include but are not limited to 
human and animal (e.g., pig, cattle, dog, horse, donkey, mouse, hamster, 
monkey) subjects. 



Pharmaceutical formulations of the present invention include those 
suitable for parenteral (e.g., subcutaneous, intracerebral, intradermal, 
intramuscular, intravenous and intraarticular) administration. Alternatively, 
pharmaceutical formulations of the present invention may be suitable for 
administration to the mucous membranes of a subject (e.g., intranasal administration 
by use of a dropper, swab, or inhaler). The formulations may be conveniently 
prepared in unit dosage form and may be prepared by any of the methods well 
known in the art. 



The following examples are provided to illustrate the present 
invention, and should not be construed as limiting thereof. In these examples, 
PBS means phosphate buffered saline, EDTA means ethylene diamine tetraacetate, 
ml means milliliter, µl means microliter, mM means millimolar, µM means 
micromolar, u means unit, PFU means plaque forming units, g means gram, mg 
means milligram, µg means microgram, cpm means counts per minute, ic means 
intracerebral or intracerebrally, ip means intraperitoneal or intraperitoneally, iv 
means intravenous or intravenously, and sc means subcutaneous or subcutaneously. 

Amino acid sequences disclosed herein are presented in the amino 
to carboxyl direction, from left to right. The amino and carboxyl groups are not 
presented in the sequence. Nucleotide sequences are presented herein by single 
strand only in the 5' to 3' direction, from left to right. Nucleotides and amino 
acids are represented herein in the manner recommended by the IUPAC-IUB 
Biochemical Nomenclature Commission, or (for amino acids) by either one letter 
or three letter code, in accordance with 37 CFR § 1.822 and established usage. 
Where one letter amino acid code is used, the same sequence is also presented 
elsewhere in three letter code. 

EXAMPLE 1 

Cells and Virus Stocks 
S.A.AR86 was isolated in 1954 from a pool of Culex sp. mosquitoes 
collected near Johannesburg, South Africa. Weinbren et al., S. Afr. Med. J. 30, 
631-36 (1956). Ockelbo82 was isolated from Culiseta sp. mosquitoes collected in 
Edsbyn, Sweden in 1982 and was associated serologically with human disease. 
Niklasson et al., Am. J. Trop. Med. Hyg. 33, 1212-17 (1984). Girdwood S.A. 
was isolated from a human patient in the Johannesburg area of South Africa in 
1963. Malherbe et al., S. Afr. Med. J. 37, 547-52 (1963). Molecularly cloned 
virus TR339 represents the deduced consensus sequence of Sindbis AR339. 
McKnight et al., J. Virol. 70, 1981-89 (1996); William Klimstra, personal 
communication. TRSB is a laboratory strain of Sindbis isolate AR339 derived 
from a cDNA clone pTRSB and differing from the AR339 consensus sequence at 
three codons. McKnight et al., J. Virol. 70, 1981-89 (1996). pTR5000 is a full-
length cDNA clone of Sindbis AR339 following the SP6 phage promoter and 
containing mostly Sindbis AR339 sequences. 

Stocks of all molecularly cloned viruses were prepared by 
electroporating genome length in vitro transcripts of their respective cDNA clones 
into BHK-21 cells. Heidner et al., J. Virol. 68, 2683-92 (1994). Girdwood S.A. 
(Malherbe et al., S. Afr. Med. J. 37, 547-52 (1963)) and Ockelbo82 (Espmark and 
Niklasson, Am. J. Trop. Med. Hyg. 33, 1203-11 (1984); Niklasson et al., Am. J. 
Trop. Med. Hyg. 33, 1212-17 (1984)) were passed one to three times in BHK-21 
cells in order to produce amplified stocks of virus. All virus stocks were 
stored at -70°C until needed. The titers of the virus stocks were determined on 
BHK-21 cells from aliquots of frozen virus. 

EXAMPLE 2 

Cloning the S.A.AR86 and Girdwood S.A. Genomic Sequences 
The sequences of S.A.AR86 (Figure 1, SEQ ID NO:1) and 
Girdwood S.A. (Figure 3, SEQ ID NO:4) were determined from uncloned reverse 
transcriptase-polymerase chain reaction (RT-PCR) fragments amplified from virion 
RNA. Heidner et al., J. Virol. 68, 2683-92 (1994). The sequence of the 5' 40 
nucleotides was determined by directly sequencing the genomic RNA. Sanger et 
al., Proc. Natl. Acad. Sci. USA 74, 5463-67 (1977); Zimmern and Kaesberg, 
Proc. Natl. Acad. Sci. USA 75, 4257-61 (1978); Ahlquist et al., Cell 23, 183-89 
(1981). 

The S.A.AR86 genome was 11,663 nucleotides in length, excluding 
the 5' CAP and 3' poly(A) tail, 40 nucleotides shorter than the alphavirus prototype 
Sindbis strain AR339. Strauss et al., Virology 133, 92-110 (1984). Compared 
with the consensus sequence of Sindbis virus AR339 (McKnight et al., J. Virol. 
70, 1981-89 (1996)), S.A.AR86 contained two separate 6-nucleotide insertions and 
one 3-nucleotide insertion in the 3' half of the nsP3 gene, a region not well 
conserved among alphaviruses. The two 6-nucleotide insertions were found 
immediately 3' of nucleotides 5403 and 5450, and the 3-nucleotide insertion was 
immediately 3' of nucleotide 5546 compared with the AR339 genome. In addition, 
S.A.AR86 contained a 54-nucleotide deletion in nsP3 which spanned nucleotides 
5256 to 5311 of AR339. As a result of these deletions and insertions, S.A.AR86 
nsP3 was 13 amino acids smaller than AR339, containing an 18-amino acid 
deletion and a total of 5 amino acids inserted. The 3' untranslated region of 
S.A.AR86 contained, with respect to AR339, two 1-nucleotide deletions at 
nucleotides 11,513 and 11,602, and one 1-nucleotide insertion following nucleotide 
11,664. The total numbers of nucleotides and predicted amino acids comprising 
the remaining genes of S.A.AR86 were identical to those of AR339. 

A notable feature of the deduced amino acid sequence of S.A.AR86 
(Figure 2, SEQ ID NO:2 and SEQ ID NO:3) was the cysteine codon in place of 
an opal termination codon between nsP3 and nsP4. S.A.AR86 is the only 
alphavirus of the Sindbis group, and one of just three alphavirus isolates sequenced 
to date, which do not contain an opal termination codon between nsP3 and nsP4. 
Takkinen, K., Nucleic Acids Res. 14, 5667-5682 (1986); Strauss et al., Virology 
164, 265-74 (1988). 

The genome of Girdwood S.A. was 11,717 nucleotides long 
excluding the 5' CAP and 3' poly(A) tail. The nucleotide sequence (SEQ ID 
NO:4) of the Girdwood S.A. genome and the putative amino acid sequence (SEQ 
ID NO:5 and SEQ ID NO:6) of the Girdwood S.A. gene products are shown in 
Figure 3 and Figure 4, respectively. The asterisk at position 1902 in SEQ ID 
NO:5 indicates the position of the opal termination codon in the coding region of 
the nonstructural polyprotein. The extra nucleotides relative to AR339 were in the 
nonconserved half of nsP3, which contained insertions totalling 15 nucleotides, and 
in the 3' untranslated region which contained two 1-nucleotide deletions and a 1-
nucleotide insertion with respect to the consensus Sindbis AR339 genome. The 
insertions found in the nsP3 gene of Girdwood S.A. were identical in position and 
content to those found in S.A.AR86, although Girdwood S.A. did not have the 
large nsP3 deletion characteristic of S.A.AR86. The remaining portions of the 
genome contained the same number of nucleotides and predicted amino acids as 
Sindbis AR339. 
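
By way of illustration only, the insertion and deletion bookkeeping reported above for the S.A.AR86 and Girdwood S.A. genomes can be checked with a short calculation. The sketch below uses only the sizes stated in this Example. The AR339 length is not stated directly in this passage; it is inferred here from the statement that S.A.AR86 is 40 nucleotides shorter, and is therefore an assumption of the sketch. The dictionary keys are descriptive labels chosen for the illustration.

```python
# Consistency check of the genome lengths using the indel sizes stated in Example 2.

SAAR86_LENGTH = 11_663     # stated, excluding 5' cap and 3' poly(A) tail
GIRDWOOD_LENGTH = 11_717   # stated, excluding 5' cap and 3' poly(A) tail

# Assumption of this sketch: AR339 length inferred from "40 nucleotides shorter".
AR339_LENGTH = SAAR86_LENGTH + 40

# Signed sizes in nucleotides: positive = insertion, negative = deletion.
saar86_indels = {
    "nsP3 insertion after nt 5403": +6,
    "nsP3 insertion after nt 5450": +6,
    "nsP3 insertion after nt 5546": +3,
    "nsP3 deletion": -54,
    "3' UTR deletion at nt 11,513": -1,
    "3' UTR deletion at nt 11,602": -1,
    "3' UTR insertion after nt 11,664": +1,
}
girdwood_indels = {
    "nsP3 insertions (total)": +15,
    "3' UTR deletion (first)": -1,
    "3' UTR deletion (second)": -1,
    "3' UTR insertion": +1,
}


def net_change(indels):
    """Net change in genome length, in nucleotides, relative to AR339."""
    return sum(indels.values())


def nsp3_codon_change(indels):
    """Net change in nsP3 length in amino acids (3 nucleotides per codon)."""
    nsp3_nt = sum(size for name, size in indels.items() if name.startswith("nsP3"))
    return nsp3_nt // 3


for name, indels, stated in (
    ("S.A.AR86", saar86_indels, SAAR86_LENGTH),
    ("Girdwood S.A.", girdwood_indels, GIRDWOOD_LENGTH),
):
    predicted = AR339_LENGTH + net_change(indels)
    print(name, "predicted:", predicted, "stated:", stated,
          "nsP3 amino acid change:", nsp3_codon_change(indels))
```

Under these assumptions the stated lengths of both genomes agree with the listed insertions and deletions, and the net nsP3 change for S.A.AR86 corresponds to the 13 amino acid reduction described above.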

Overall, Girdwood S.A. was 94.5% identical to the consensus 
Sindbis AR339 sequence, differing at 655 nucleotides not including the insertions 
and deletions. These nucleotide differences resulted in 88 predicted amino acid 
changes or a difference of 2.3%. A plurality of amino acid differences were 
concentrated in the nsP3 gene, which contained 32 of the amino acid changes, 25 
of which were in the nonconserved 3' half. 

The Girdwood S.A. nucleotides at positions 1, 3, and 11,717 could 
not be resolved. Because the primer used during the RT-PCR amplification of the 
3' end of the genome assumed a cytosine in the 3' terminal position, the identity 
of this nucleotide could not be determined with certainty. However, in all 
alphaviruses sequenced to date there is a cytosine in this position. This, combined 
with the fact that no difficulty was encountered in obtaining RT-PCR product for 
this region with an oligo(dT) primer ending with a 3'G, suggested that Girdwood 
S.A. also contains a cytosine at this position. The ambiguity at nucleotide 
positions 1 and 3 resulted from strong stops encountered during the RNA 
sequencing. 

EXAMPLE 3 

Comparison of S.A.AR86 and Girdwood S.A. 
Sequences With Other Sindbis-Related Virus Sequences 

Table 1 examines the relationship of S.A.AR86 and Girdwood S.A. 
to each other and to other Sindbis-related viruses. This was accomplished by 
aligning the nucleotide and deduced amino acid sequences of Ockelbo82, AR339 
and Girdwood S.A. to those of S.A.AR86 and then calculating the percentage 
identity for each gene using the programs contained within the Wisconsin GCG 
package (Genetics Computer Group, 575 Science Drive, Madison WI 53711), as 
described in more detail in McKnight et al., J. Virol. 70, 1981-89 (1996). 
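
By way of illustration only, a percentage identity of the kind described above can be computed from two pre-aligned sequences as sketched below. This is not the GCG package and does not reproduce its alignment step; it is a plain counting routine, and the two short sequences are hypothetical fragments invented for the illustration rather than taken from any SEQ ID NO. Columns containing a gap are skipped, which is one reasonable reading of counting differences "not including the insertions and deletions".

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned, ungapped columns at which two sequences match.

    Both inputs must already be aligned to the same length; '-' marks a gap.
    Columns with a gap in either sequence are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned RNA fragments, for illustration only.
reference = "AUGGAGAAGCCAGUA---GUAAAC"
query     = "AUGGAAAAGCCAGUAGCUGUGAAC"
print(f"{percent_identity(reference, query):.1f}% identity")
```

Applying such a routine gene by gene to an alignment yields a per-gene table of the kind referred to as Table 1.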

The analysis suggests that S.A.AR86 is most similar to the other 
South African isolate, Girdwood S.A., and that the South African isolates are more 
similar to the Swedish Ockelbo82 isolate than to the Egyptian Sindbis AR339 
isolate. These results also suggest that it is unlikely that S.A.AR86 is a 
recombinant virus like WEE virus. Hahn et al., Proc. Natl. Acad. Sci. USA 85, 
5997-6001 (1988). 



WO 98/36779    PCT/US98/02945 

TABLE 1 

[Table 1: percentage identity, by gene, of the nucleotide and deduced amino acid sequences of Ockelbo82, AR339, and Girdwood S.A. relative to those of S.A.AR86. The body of the table is not legible in the source.] 

EXAMPLE 4 
Neurovirulence of S.A.AR86 and Girdwood S.A. 
Girdwood S.A., Ockelbo82, and S.A.AR86 are related by sequence; 
in contrast, it has previously been reported that only S.A.AR86 displayed the adult 
mouse neurovirulence phenotype. Russell et al., J. Virol. 63, 1619-29 (1989). 
These findings were confirmed by the present investigations. Briefly, groups of 
four female CD-1 mice (3-6 weeks of age) were inoculated ic with 10^ plaque-
forming units (PFU) of S.A.AR86, Girdwood S.A., or Ockelbo82. Neither 
Girdwood S.A. nor Ockelbo82 infection produced any clinical signs of infection. 
Infection with S.A.AR86 produced neurological signs within four to five days and 
ultimately killed 100% of the mice as previously demonstrated. 

Table 2 lists those amino acids of S.A.AR86 which might explain 
the neurovirulence phenotype in adult mice. A position was scored as potentially 
related to the S.A.AR86 adult neurovirulence phenotype if the S.A.AR86 amino 
acid differed from that which otherwise was absolutely conserved at that position 
in the other viruses. 



TABLE 2 

Divergent Amino Acids in S.A.AR86 
Potentially Related to the Adult Neurovirulence Phenotype 

Protein    Position in    S.A.AR86       Conserved 
           S.A.AR86       Amino Acid     Amino Acid 

nsP1       583            Thr            Ile 
nsP2       256            Arg            Ala 
           848            Ile            Val 
           651            Lys            Glu 
nsP3       344            Gly            Glu 
           386            Tyr            Ser 
           441            Asp            Gly 
           445            Ile            Met 
           537            Cys            Opal 
E2         243            Ser            Leu 
6K         30             Val            Ile 
E1         112            Val            Ala 
           169            Leu            Ser 
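
By way of illustration only, the scoring rule used for Table 2 (a position is reported when the S.A.AR86 residue differs from a residue that is otherwise absolutely conserved in the other viruses) can be sketched as follows. The function name and the short peptide strings are hypothetical and are not taken from the disclosed sequences; the inputs are assumed to be pre-aligned and of equal length.

```python
def divergent_positions(target, others):
    """Return (1-based position, target residue, conserved residue) for each
    alignment column where every sequence in `others` carries the same
    residue and `target` carries a different one. Gap columns are ignored."""
    if any(len(seq) != len(target) for seq in others):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for i, residue in enumerate(target):
        column = {seq[i] for seq in others}
        if len(column) != 1:
            continue                      # not absolutely conserved among the others
        conserved = next(iter(column))
        if "-" in (residue, conserved):
            continue                      # skip gapped columns
        if residue != conserved:
            hits.append((i + 1, residue, conserved))
    return hits


# Hypothetical aligned peptide fragments, for illustration only.
target_fragment = "MTKLAV"
other_fragments = ["MIKLAV", "MIKLSV", "MIKLAV"]
print(divergent_positions(target_fragment, other_fragments))
```

In this toy alignment only the second column is reported: the fifth column varies among the comparison sequences and so is not "absolutely conserved".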

EXAMPLE 5 
pS55 Molecular Clone of S.A.AR86 
As a first step in investigating the unique adult mouse 
neurovirulence phenotype of S.A.AR86, a full-length cDNA clone of the 
S.A.AR86 genome was constructed. The sources of cDNA included conventional 
cDNA clones (Davis et al., Virology 171, 189-204 (1989)) as well as uncloned 
RT-PCR fragments derived from the S.A.AR86 genome. As described previously, 
these were substituted, starting at the 3' end, into pTR5000 (McKnight et al., J. 
Virol. 70, 1981-89 (1996)), a full-length Sindbis clone from which infectious 
genomic replicas could be derived by transcription with SP6 polymerase in vitro. 

The end result was pS55, a molecular clone of S.A.AR86 from 
which infectious transcripts could be produced and which contained four nucleotide 
changes (G for A at nt 215; G for C at nt 3863; G for A at nt 5984; and C for T 
at nt 9113) but no amino acid coding differences with respect to the S.A.AR86 
genomic RNA (amino acid sequence of S.A.AR86 presented in Figure 2 (SEQ ID 
NO:2 and SEQ ID NO:3)). The nucleotide sequence of clone pS55 is presented 
in Figure 5 (SEQ ID NO:7). 

As has been described by Simpson et al., Virology 222, 464-69 
(1996), neurovirulence and replication of the virus derived from pS55 (S55) were 
compared with those of S.A.AR86. It was found that S55 exhibits the distinctive 
adult neurovirulence characteristic of S.A.AR86. Like S.A.AR86, S55 produces 
100% mortality in adult mice infected with the virus, and the survival times of 
animals infected with the two viruses were indistinguishable. In addition, S55 and 
S.A.AR86 were found to replicate to essentially equivalent titers in vivo, and the 
profiles of S55 and S.A.AR86 virus growth in the central nervous system and 
periphery were very similar. 

From these data it was concluded that the silent changes found in 
virus derived from clone pS55 had little or no effect on its growth or virulence, 
and that this molecularly cloned virus accurately represents the biological isolate, 
S.A.AR86. 

EXAMPLE 6 
Construction of the Consensus AR339 Virus TR339 
The consensus sequence of the Sindbis virus AR339 isolate, the 
prototype alphavirus, was deduced. The consensus AR339 sequence was inferred 
by comparison of the TRSB sequence (a laboratory-derived AR339 strain) with the 
complete or partial sequences of HRsp (the GenBank sequence; Strauss et al., 
Virology 133, 92-110 (1984)), SV1A and NSV (AR339-derived laboratory strains; 
Lustig et al., J. Virol. 62, 2329-36 (1988)), and SIN (a laboratory-derived AR339 
strain; Davis et al., Virology 161, 101-108 (1987), Strauss et al., J. Virol. 65, 
4654-64 (1991)). Each of these viruses was descended from AR339. Where these 
sequences differed from each other, they also were compared with the amino acid 
sequences of other viruses related to Sindbis virus: Ockelbo82, S.A.AR86, 
Girdwood S.A., and the somewhat more distantly related Aura virus. Rumenapf 
et al., Virology 208, 621-33 (1995). 

The details of determining a consensus AR339 sequence and 
constructing the consensus virus TR339 have been described elsewhere. McKnight 
et al., J. Virol. 70, 1981-89 (1996); Klimstra et al., manuscript in preparation. 
The nucleotide sequence (SEQ ID NO:8) of pTR339 is presented in Figure 6. 
The deduced amino acid sequences of the pTR339 non-structural and structural 
polyproteins are shown as SEQ ID NO:9 and SEQ ID NO:10, respectively. The 
asterisk at position 1897 in SEQ ID NO:9 indicates the position of the opal 
termination codon in the coding region of the nonstructural polyprotein. The 
consensus nucleotide sequence diverged from the pTRSB sequence at three coding 
positions (nsP3 528, E2 1, and E1 72). These differences are illustrated in Table 
3. 



TABLE 3 

Amino Acid Differences Between 
Laboratory Strain TRSB and Molecular Clone TR339 

           nsP3 528 (nt 5683)    E2 1 (nt 8633)    E1 72 (nt 10279) 

TR339      Arg (CGA)             Ser (AGC)         Ala (GCU) 
TRSB       Gln (CAA)             Arg (AGA)         Val (GUU) 
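
By way of illustration only, the codon assignments of Table 3 can be cross-checked against the standard genetic code, and the changed codon position located, as sketched below. Only the six codons appearing in the table are included in the lookup. The E2 codon of TR339 is read here as AGC, which is an assumption of this sketch; the variable and function names are likewise illustrative.

```python
# Standard genetic code entries for the six codons listed in Table 3 only.
CODON_TO_AMINO_ACID = {
    "CGA": "Arg", "AGC": "Ser", "GCU": "Ala",
    "CAA": "Gln", "AGA": "Arg", "GUU": "Val",
}

# Codons per site; the TR339 E2 codon "AGC" is an assumed reading.
TABLE_3 = {
    "nsP3 528": {"TR339": "CGA", "TRSB": "CAA"},
    "E2 1":     {"TR339": "AGC", "TRSB": "AGA"},
    "E1 72":    {"TR339": "GCU", "TRSB": "GUU"},
}


def describe_site(site):
    """Return (TR339 residue, TRSB residue, changed codon positions, 1-based)."""
    tr339 = TABLE_3[site]["TR339"]
    trsb = TABLE_3[site]["TRSB"]
    changed = [i + 1 for i in range(3) if tr339[i] != trsb[i]]
    return CODON_TO_AMINO_ACID[tr339], CODON_TO_AMINO_ACID[trsb], changed


for site in TABLE_3:
    tr339_aa, trsb_aa, changed = describe_site(site)
    print(f"{site}: TR339 {tr339_aa} / TRSB {trsb_aa}; codon position(s) changed: {changed}")
```

On this reading each of the three coding differences arises from a single nucleotide change within the codon.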

EXAMPLE 7 
Animals Used for In Vivo Localization Studies 
Specific pathogen free CD-1 mice were obtained from Charles River 
Breeding Laboratories (Raleigh, North Carolina) at 21 days of age and maintained 
under barrier conditions until approximately 37 days of age. Intracerebral (ic) 
inoculations were performed as previously described, Simpson et al., Virology 222, 
464-69 (1996), with 500 PFU of S51 (an attenuated mutant of S55) or 10^ PFU of 
S55. Animals inoculated peripherally were first anesthetized with METOFANE®. 
Then, 25 µl of diluent (PBS, pH 7.2, 1% donor calf serum, 100 u/ml penicillin, 
50 µg/ml streptomycin, 0.9 mM CaCl2, and 0.5 mM MgCl2) containing 10^ PFU 
of virus were injected either intravenously (iv) into the tail vein, subcutaneously 
(sc) into the skin above the shoulder blades on the middle of the back, or 
intraperitoneally (ip) in the lower right abdomen. Animals were sacrificed at 
various times post-inoculation as previously described. Simpson et al., Virology 222, 
464-69 (1996). Brains (including brainstems) were homogenized in diluent to 30% 
w/v, and right quadriceps were homogenized in diluent to 25% w/v. Homogenates 
were handled and titered as described previously. Simpson et al., Virology 222, 464-
69 (1996). Bone marrow was harvested by crushing both femurs from each animal 
in sufficient diluent to produce a 30% w/v suspension (calculated as weight of 
uncrushed femurs in volume of diluent). Samples were stored at -70°C. For 
titration, samples were thawed and clarified by centrifugation at 1,000 x g for 20 
minutes at 4°C before being titered by conventional plaque assay on BHK-21 cells. 
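
By way of illustration only, the weight/volume arithmetic used to prepare the suspensions, and the conversion of a plaque count into a titer per gram of tissue, can be sketched as follows. The formulas are the conventional ones for % w/v and plaque assays; the tissue mass, plaque count, dilution, and plated volume in the usage lines are hypothetical values chosen for the illustration and are not data from this Example.

```python
def diluent_volume_ml(tissue_mass_g, percent_w_v):
    """Volume of diluent (ml) giving the requested % w/v, where % w/v is
    grams of tissue per 100 ml of diluent."""
    if percent_w_v <= 0:
        raise ValueError("percent_w_v must be positive")
    return tissue_mass_g * 100.0 / percent_w_v


def titer_pfu_per_g(plaques, dilution, volume_plated_ml, percent_w_v):
    """Titer in PFU per gram of tissue from a single counted dilution.

    dilution is the fraction of the original homogenate in the plated
    sample (for example 1e-2 for a 1:100 dilution).
    """
    pfu_per_ml = plaques / (dilution * volume_plated_ml)
    grams_per_ml = percent_w_v / 100.0
    return pfu_per_ml / grams_per_ml


# Hypothetical example: 0.12 g of femur prepared as a 30% w/v suspension.
print(diluent_volume_ml(0.12, 30.0), "ml of diluent")

# Hypothetical example: 12 plaques from 0.2 ml of a 1:100 dilution of a 30% w/v suspension.
print(titer_pfu_per_g(12, 1e-2, 0.2, 30.0), "PFU/g")
```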

EXAMPLE 8 
Tissue Preparation for In Situ Hybridization Studies 
Animals were anesthetized by ip injection of 0.5 ml AVERTIN® at 
various times post-inoculation followed by perfusion with 60 to 75 ml of 4% 
paraformaldehyde in PBS (pH 7.2) at a flow rate of 10 ml per minute. The entire 
carcass was decalcified for 8 to 10 weeks in 4% paraformaldehyde containing 8% 
EDTA in PBS (pH 6.8) at 4°C. This solution was changed twice during the 
decalcification period. Selected tissues were cut into blocks approximately 3 mm 
thick and placed into biopsy cassettes for paraffin embedding and sectioning. 
Blocks were embedded, sectioned and hematoxylin/eosin stained by Experimental 
Pathology Laboratories (Research Triangle Park, North Carolina) or North 
Carolina State University Veterinary School Pathology Laboratory (Raleigh, North 
Carolina). 

EXAMPLE 9 
In Situ Hybridization 
Hybridizations were performed using a [35S]-UTP labeled S.A.AR86 
specific riboprobe derived from pDS-45. Clone pDS-45 was constructed by first 
amplifying a 707 base pair fragment from pS55 by PCR using primers 7241 (5'-
CTGCGGCGGATTCATCTTGC-3', SEQ ID NO:11) and SC-3 (5'-
CTCCAACTTAAGTG-3', SEQ ID NO:12). The resulting 707 base pair fragment 
was purified using a GENE CLEAN® kit (Bio101, CA), digested with HhaI, and 
cloned into the SmaI site of pSP72 (Promega). Linearizing pDS-45 with EcoR^ 
and performing an in vitro transcription reaction with SP6 DNA-dependent RNA 
polymerase (Promega) in the presence of [35S]-UTP resulted in a riboprobe 
approximately 500 nucleotides in length of which 445 nucleotides were 
complementary to the S.A.AR86 genome (nucleotides 7371 through 7816). A 
riboprobe specific for the influenza strain PR-8 hemagglutinin (HA) gene was used 
as a control probe to test non-specific binding. The in situ hybridizations were 
performed as described previously (Charles et al., Virology 208, 662-71 (1995)) using 
10^ cpm of probe per slide. 
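
By way of illustration only, basic properties of the amplification primers recited above (length, GC fraction, and the reverse complement, i.e. the sequence of the strand to which a primer anneals) can be computed as sketched below. The primer sequences are copied as printed in this Example; the helper names are illustrative and form no part of the disclosed method.

```python
# Translation table for DNA complementation (A<->T, C<->G).
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(DNA_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


# Primer sequences as printed in the text (5' to 3').
PRIMER_7241 = "CTGCGGCGGATTCATCTTGC"   # SEQ ID NO:11
PRIMER_SC3 = "CTCCAACTTAAGTG"          # SEQ ID NO:12

for name, primer in (("7241", PRIMER_7241), ("SC-3", PRIMER_SC3)):
    print(name, len(primer), "nt,",
          f"GC fraction {gc_fraction(primer):.2f},",
          "reverse complement", reverse_complement(primer))
```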

EXAMPLE 10 

Replication of S.A.AR86 in Bone Marrow 
Three groups of six adult mice each were inoculated peripherally 
(sc, ip, or iv) with 1200 PFU of S55 (a molecular clone of S.A.AR86) in 25 µl 
of diluent. Under these conditions, the infection produced no morbidity or 
mortality. Two mice from each group were anesthetized and sacrificed at 2, 4 and 
6 days post-inoculation by exsanguination. The serum, brain (including 
brainstem), right quadricep, and both femurs were harvested and titered by plaque 
assay. Virus was never detected in the quadricep samples of animals inoculated 
sc (Table 4). A single animal inoculated ip (two days post-inoculation) and two 
mice inoculated iv (at four and six days post-inoculation) had detectable virus in 
the right quadricep, but the titer was at or just above the limit of detection (6.25 
PFU/g tissue). Virus was present sporadically or at low levels in the brain and 
serum of animals regardless of the route of inoculation. Virus was detected in the 
bone marrow of animals regardless of the route of inoculation. However, the 
presence of virus in bone marrow of animals inoculated sc or ip was more sporadic 
than in animals inoculated iv, where five out of six animals had detectable virus. 
These results suggest that S55 targets to the bone marrow, especially following iv 
inoculation. 

The level and frequency of virus detected in the serum and muscle 
suggested that virus detected in the bone marrow was not residual virus 
contamination from blood or connective tissue remaining in bone marrow samples. 
The following experiment also suggested that virus in bone marrow was not due 
to tissue or serum contamination. Mice were inoculated ic with 1200 PFU of S55 
in 25 µl of diluent. Animals were sacrificed at 0.25, 0.5, 1, 1.5, 2, 3, 4, 5, and 
6 days post-inoculation, and the carcasses were decalcified as described in 
Example 8. Coronal sections taken at approximately 3 mm intervals through the 
head, spine (including shoulder area), and hips were probed with an S55-specific 
[35S]-UTP labeled riboprobe derived from pDS-45. Positive in situ hybridization 
signal was detected by one day post-inoculation in the bone marrow of the skull 
(data not shown). Weak signal also was present in some of the chondrocytes of 
the vertebrae, suggesting that S55 was replicating in these cells as well. Although 
the frequency of positive bone marrow cells was low, the signal was very intense 
over individual positive cells. This result strongly suggests that S55 replicates in 
vivo in a subset of cells contained in the bone marrow. 

EXAMPLE 11 

Other Sindbis Group Viruses 
It was of interest to determine if the ability to replicate in the bone 
marrow of mice was unique to S55 or was a general feature of other viruses, both 
Sindbis and non-Sindbis viruses, in the Sindbis group. Six 38-day-old female CD-
1 mice were inoculated iv with 25 µl of diluent containing 10^ PFU of S55, 
Ockelbo82, Girdwood S.A., TR339, or TRSB. At 2, 4 and 6 days post-
inoculation two mice from each group were sacrificed and whole blood, serum, 
brain (including brainstem), right quadricep, and both femurs were harvested for 
virus titration. 

The results of this experiment were similar to those with S55. 
TRSB infected animals had no virus detectable in serum or whole blood in any 
animal at any time, and with the other viruses tested, no virus was detected in the 
serum or whole blood of any animal beyond two days post-inoculation (detection 
limit, 25 PFU/ml). Neither TRSB nor TR339 was detectable in the brains of 
infected animals at any time post-inoculation. S55, Girdwood S.A., and 
Ockelbo82 were present in the brains of infected animals sporadically with the 
titers being at or near the 75 PFU/g level of detection. All the tested viruses were 
found sporadically at or slightly above the 50 PFU/g detection limit in the right 
quadricep of infected animals except for a single animal four days post-inoculation 
with TRSB which had nearly 10^ PFU/g of virus in its quadricep. 

The frequency at which the different viruses were detected in bone 
marrow varied widely, with S55 and Girdwood S.A. being the most frequently 
isolated (five out of six animals) and Ockelbo82 and TRSB being the least 
frequently isolated from bone marrow (one out of six animals and two out of six 
animals, respectively) (Table 4). Girdwood S.A. and S55 gave nearly identical 
profiles in all tissues. Girdwood S.A., unlike S.A.AR86, is not neurovirulent in 
adult mice (Example 4), suggesting that the adult neurovirulence phenotype is 
distinct from the ability of the virus to replicate efficiently in bone marrow. 

EXAMPLE 12 

Virus Persistence in Bone Marrow 
The next step in our investigations was to evaluate the possibility 
that S.A.AR86 persisted long-term in bone marrow. S51 is a molecularly cloned, 
attenuated mutant of S55. S51 differs from S55 by a threonine for isoleucine 
substitution at amino acid residue 538 of nsP1 and is attenuated in adult mice 
inoculated intracerebrally. Like S55, S51 targeted to and replicated in the bone 
marrow of 37-day-old female CD-1 mice following ic inoculation. Mice were 
inoculated ic with 500 PFU of S51 and sacrificed at 4, 8, 16, and 30 days post-
inoculation for determination of bone marrow and serum titers. At no time post-
inoculation was virus detected in the serum above the 6.25 PFU/ml detection limit. 
Virus was detectable in the bone marrow samples of both animals sampled at four 
days post-inoculation and in one animal eight days post-inoculation (Table 5). No 
virus was detectable by titration on BHK-21 cells in any of the bone marrow 
samples beyond eight days post-inoculation. These results suggested that the 
attenuating mutation present in S51, which reduces the neurovirulence of the virus, 
did not impair acute viral replication in the bone marrow. 

It was notable that the plaque size on BHK-21 cells of virus 
recovered on day 4 post-inoculation was smaller than the size of plaques produced 
by the inoculum virus, and that plaques produced from virus recovered from the 
day 8 post-inoculation samples were even smaller and barely visible. This 
suggests a strong selective pressure in the bone marrow for virus that is much less 
efficient in forming plaques on BHK-21 cells. 

To demonstrate that S51 virus genomes were present in bone 
marrow cells long after acute infection, four to six-week-old female CD-1 mice 
were inoculated ic with 500 PFU of S51. Three months post-inoculation two 
animals were sacrificed, perfused with paraformaldehyde and decalcified as 
described in Example 8. The heads and hind limbs from these animals were 
paraffin embedded, sectioned, and probed with a S.A.AR86 specific [35S]-UTP 
labeled riboprobe derived from clone pDS-45. In situ hybridization signal was 
clearly present in discrete cells of the bone and bone marrow of the legs (data not 
shown). Furthermore, no in situ hybridization signal was detected in an adjacent 
control section probed with an influenza virus HA gene specific riboprobe. As the 
relative sensitivity of in situ hybridization is reduced in decalcified tissues (Peter 
Charles, personal communication), these cells likely contain a relatively high 
number of viral sequences, even at three months post-inoculation. No in situ 
hybridization signal was observed in mid-sagittal sections of the heads with the 
S.A.AR86 specific probe, although focal lesions were observed in the brain 
indicative of the prior acute infection with S51. 



TABLE 5 

                    Titers (Total PFU/Animal) 
Days Post-                                        Limit of 
Inoculation         Animal A       Animal B       Detection 

4                   2100           380            62.5 
8                   62.5           N.D.(a)        62.5 
16                  N.D.           N.D.           62.5 
30                  N.D.           N.D.           62.5 

(a) "N.D." indicates that the virus titers were below the limit of detection. 
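
By way of illustration only, the titers of Table 5 can be held in a small data structure and summarized to confirm the statements made in Example 12 (virus detectable in both animals at day 4, in one animal at day 8, and in none thereafter). The variable names are illustrative; "N.D." entries are represented as None.

```python
LIMIT_OF_DETECTION = 62.5  # total PFU per animal, as listed in Table 5

# Day post-inoculation -> (Animal A, Animal B); None stands for "N.D."
bone_marrow_titers = {
    4: (2100.0, 380.0),
    8: (62.5, None),
    16: (None, None),
    30: (None, None),
}


def animals_positive(day):
    """Number of animals with virus at or above the limit of detection."""
    return sum(
        titer is not None and titer >= LIMIT_OF_DETECTION
        for titer in bone_marrow_titers[day]
    )


positives_by_day = {day: animals_positive(day) for day in sorted(bone_marrow_titers)}
last_positive_day = max(day for day, n in positives_by_day.items() if n > 0)

print(positives_by_day)
print("last day with detectable virus:", last_positive_day)
```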

EXAMPLE 13 

Replication of S.A.AR86 within Bone/Joint Tissue of Adult Mice 

Several Old World alphaviruses, including Ross River virus, 
Chikungunya virus, Ockelbo82, and S.A.AR86, are associated with acute and persistent 
arthritis/arthralgia in humans. Molecular clones of several Sindbis group viruses, 
including S.A.AR86, were used to investigate alphavirus replication within bone/joint 
tissue. 

Following intravenous inoculation of S.A.AR86 into adult CD-1 mice, 
viral replication was observed in bone/joint tissue, but not surrounding muscle tissue of 
the hind limbs. Infectious virus was detectable 24 hours post-infection; however, viral 
titer within bone/joint tissue was maximal 72 hours post-infection. Fractionation of 
hind limbs from infected animals revealed that the hip and knee joints were the 
predominant sites of viral replication. Replication within bone/joint tissue appears to be 
a common trait of Sindbis-group viruses, since the laboratory strains TR339 and TRSB 
also replicated within bone/joint tissue. In situ hybridization and S.A.AR86 based 
double promoter vectors expressing green fluorescent protein were used to further 
localize S.A.AR86 infected cells within bone/joint tissue. Green fluorescent protein 
expression was detected in bone/joint tissue for at least one month post-inoculation. 
These studies demonstrated that cells within the endosteum of synovial joints were the 
predominant site of S.A.AR86 replication. 
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TH AT WHICH IS CLAIMED IS: 

1. A method of introducing and expressing heterologous RNA 
in bone marrow cells, comprising: 

(a) providing a recombinant alphavirus, said alphavirus 
containing a heterologous RNA segment, said heterologous RNA segment 
comprising a promoter operable in said bone marrow cells operatively associated 
with a heterologous RNA to be expressed in said bone marrow cells; and then 

(b) contacting said recombinant alphavirus to said bone marrow 
cells so that said heterologous RNA segment is introduced and expressed therein. 

2. A method according to claim 1, wherein said contacting step 
is carried out in vitro. 

3. A method according to claim 1, wherein said contacting step 
is carried out in vivo in a subject in need of such treatment. 

4. A method according to claim 1, wherein said heterologous 
RNA encodes a protein or peptide. 

5. A method according to claim 1, wherein said heterologous 
RNA encodes an immunogenic protein or peptide. 

6. A method according to claim 1, wherein said heterologous 
RNA encodes an antisense oligonucleotide or a ribozyme. 

7. A method according to claim 1, wherein said alphavirus is 
an Old World alphavirus. 

8. A method according to claim 1, wherein said alphavirus is 
selected from the group consisting of SF group and SIN group alphaviruses. 

9. A method according to claim 1, wherein said alphavirus is 
selected from the group consisting of Semliki Forest virus, Middelburg virus, 
Chikungunya virus, O'Nyong-Nyong virus, Ross River virus, Barmah Forest 
virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus, Una virus, 
Sindbis virus, South African Arbovirus No. 86, Ockelbo virus, Girdwood S.A. 
virus, Aura virus, Whataroa virus, Babanki virus, and Kyzylagach virus. 

10. A method according to claim 1, wherein said alphavirus is 
South African Arbovirus No. 86. 

11. A method according to claim 1, wherein said alphavirus is 
Girdwood S.A. 

12. A method according to claim 1, wherein said alphavirus is 
Sindbis strain TR339. 

13. A helper cell for expressing an infectious, propagation 
defective, Girdwood S.A. virus particle, comprising, in a Girdwood S.A.-permissive 
cell: 

(a) a first helper RNA encoding (i) at least one Girdwood S.A. 
structural protein, and (ii) not encoding at least one other Girdwood S.A. structural 
protein; and 

(b) a second helper RNA separate from said first helper RNA, 
said second helper RNA (i) not encoding said at least one Girdwood S.A. 
structural protein encoded by said first helper RNA, and (ii) encoding said at least 
one other Girdwood S.A. structural protein not encoded by said first helper RNA, 
and with all of said Girdwood S.A. structural proteins encoded by said first and 
second helper RNAs assembling together into Girdwood S.A. particles in said cell 
containing said replicon RNA; 

and wherein the Girdwood S.A. packaging segment is deleted from 
at least said first helper RNA. 

14. The helper cell according to claim 13, further containing a 
replicon RNA; 

said replicon RNA encoding said Girdwood S.A. packaging segment 
and an inserted heterologous RNA; 

wherein said Girdwood S.A. packaging segment is deleted from at 
least one of said helper RNA; 

and wherein said replicon RNA, said first helper RNA, and said 
second helper RNA are all separate molecules from one another. 

15. The helper cell according to claim 13, further containing a 
replicon RNA; 

said replicon RNA encoding said Girdwood S.A. packaging segment 
and an inserted heterologous RNA; 

wherein said replicon RNA and said first helper RNA are separate 
molecules; 

and wherein the molecule containing said replicon RNA further 
contains RNA encoding said at least one Girdwood S.A. structural protein not 
encoded by said first helper RNA. 

16. The helper cell according to claim 13, wherein said first 
helper RNA encodes both the Girdwood S.A. E1 glycoprotein and the Girdwood 
S.A. E2 glycoprotein, and wherein said second helper RNA encodes the Girdwood 
S.A. capsid protein. 

17. A method of making infectious, propagation defective, 
Girdwood S.A. virus particles, comprising: 

transfecting a Girdwood S.A.-permissive cell according to claim 13 
with a propagation defective replicon RNA, said replicon RNA including said 
Girdwood S.A. packaging segment and an inserted heterologous RNA; 

producing said Girdwood S.A. virus particles in said transfected 
cell; and then 

collecting said Girdwood S.A. virus particles from said cell. 

18. Infectious Girdwood S.A. virus particles produced by the 
method of claim 17. 

19. Infectious Girdwood S.A. virus particles containing a 
replicon RNA encoding a promoter, an inserted heterologous RNA, and wherein 
RNA encoding at least one Girdwood S.A. structural protein is deleted therefrom 
so that said Girdwood S.A. virus particle is propagation defective. 

20. A pharmaceutical formulation comprising infectious 
Girdwood S.A. virus particles according to claim 18 or 19 in a pharmaceutically 
acceptable carrier. 

21. A helper cell for expressing an infectious, propagation 
defective, TR339 virus particle, comprising, in a TR339-permissive cell: 

(a) a first helper RNA encoding (i) at least one TR339 structural 
protein, and (ii) not encoding at least one other TR339 structural protein; and 

(b) a second helper RNA separate from said first helper RNA, 
said second helper RNA (i) not encoding said at least one TR339 structural protein 
encoded by said first helper RNA, and (ii) encoding said at least one other TR339 
structural protein not encoded by said first helper RNA, and with all of said 
TR339 structural proteins encoded by said first and second helper RNAs 
assembling together into TR339 particles in said cell containing said replicon 
RNA; 

and wherein the TR339 packaging segment is deleted from at least 
said first helper RNA. 

22. The helper cell according to claim 21, further containing a 
replicon RNA; 

said replicon RNA encoding said TR339 packaging segment and an 
inserted heterologous RNA; 

wherein said TR339 packaging segment is deleted from at least one 
of said helper RNA; 

and wherein said replicon RNA, said first helper RNA, and said 
second helper RNA are all separate molecules from one another. 

23. The helper cell according to claim 21, further containing a 
replicon RNA; 

said replicon RNA encoding said TR339 packaging segment and an 
inserted heterologous RNA; 

wherein said replicon RNA and said first helper RNA are separate 
molecules; 

and wherein the molecule containing said replicon RNA further 
contains RNA encoding said at least one TR339 structural protein not encoded by 
said first helper RNA. 

24. The helper cell according to claim 21, wherein said first 
helper RNA encodes both the TR339 E1 glycoprotein and the TR339 E2 
glycoprotein, and wherein said second helper RNA encodes the TR339 capsid 
protein. 

25. A method of making infectious, propagation defective, 
TR339 virus particles, comprising: 

transfecting a TR339-permissive cell according to claim 21 with a 
propagation defective replicon RNA, said replicon RNA including said TR339 
packaging segment and an inserted heterologous RNA; 

producing said TR339 virus particles in said transfected cell; and 
then 

collecting said TR339 virus particles from said cell. 

26. Infectious TR339 virus particles produced by the method of 
claim 25. 



27. Infectious TR339 virus particles containing a replicon RNA 
encoding a promoter, an inserted heterologous RNA, and wherein RNA encoding 
at least one TR339 structural protein is deleted therefrom so that said virus particle 
is propagation defective. 



28. A pharmaceutical formulation comprising infectious TR339 
virus particles according to Claim 26 or 27 in a pharmaceutically acceptable carrier. 



WO 98/36779 PCT/US98/02945 

29. A recombinant DNA comprising a cDNA coding for an 
infectious Girdwood S.A. virus RNA transcript and a heterologous promoter 
positioned upstream from said cDNA and operatively associated therewith. 

30. An infectious RNA transcript encoded by a cDNA according 
to claim 29. 

31. An infectious RNA according to claim 30, said infectious 
Girdwood S.A. RNA transcript containing a heterologous RNA segment, said 
heterologous RNA segment comprising a promoter operably associated with a 
heterologous RNA. 

32. Infectious viral particles containing an RNA transcript 
according to claim 30. 

33. A recombinant DNA comprising a cDNA coding for a 
Sindbis strain TR339 RNA transcript and a heterologous promoter positioned 
upstream from said cDNA and operatively associated therewith. 

34. An infectious RNA transcript encoded by a cDNA according 
to claim 33. 

35. An infectious RNA according to claim 34, said infectious 
Girdwood S.A. RNA transcript containing a heterologous RNA segment, said 
heterologous RNA segment comprising a promoter operably associated with a 
heterologous RNA. 

36. Infectious viral particles containing an RNA transcript 
according to claim 34. 

Nucleotide Sequence of S.A.AR86 



I ATrCGCCGCC TAGTACACAC TATTOAATCA AACAGCCCAC CAATTCCACT ACCATCACAA TCCACAAGCC AGTAGTTAAC CTAGACGTAG ACCCTCAGAO 
lot ILLUUIblL OTOCAACTCC AAAAOACCTT CCCGCAATTT CACGTACTTAa CACAGCACCT CACTCCAAAT GACCATGCTA ATGCCACAGC ATTTTCGCAT 
201 CTCGCCAGTA AACTAATCGA GCTGGACGTT CCTACCACAG COACGATnT CQACATAGCC ACCGCACCCG CTCGTACAAT GTTTrCCCAG CACCAGTACC 
301 ATTCCGTTTO CCCCAT GCO T AGTCCACAAO ACCCGCACOG CATGATGAAA TATCCCACCA AACT CG CCCA AAAACCATCT AAGATTACAA ACAACAACTT 
401 CCATGACAAG ATCAACGACC TCCCGACCCT ACTTCATACA CCGCATCCTO AAACCCCATC actctocttc cacaaccato ttacctgcaa cacccctccc 
501 GAGTACTCCG TCATGCAGCA CCTGTACATC AACCCTCCCG CAACTATTTA CCACCAGGCT ATGAAAGCCC TCCCGACCCT CTACTGCATT GGCTTCCACA 
601 CCACCCAGIT CAT bllCICG GCTATQCCAG GTTCGTACCC TGCATACAAC ACCAACTCCC CCCACGAAAA ACTTCCTTGAA GCCCGTAACA TCGGACTCTG 
TO! CACCACAAAC CTGAGTGAAO GCAGGACAGG AAAGTTCTCC ATAATGAGGA AGAAGGAGTT GAACCCCGGG TCACCGOnr ATTTcrccor TCGATCQACA 
801 CnTACCCAO AACACAGACC CACCTTGCAG ACCTOOCATC TTCCATCCCT GTTCCACTTG AAAGGAAACC AGTCCTACAC TTCCCCCTCT CATACAGTCG 
901 TCAGCrCCGA AGCCTACCTA GTGAAGAAAA TCACCATCAG TCCCCCCATC ACCGGACAAA CCCTCGCATA CCCCCITACA AACAATACCG ACCCCITCTT 
1001 CCTATCCAAA CTTACCGATA CAGTAAAAGG AGAACGCCTA TCOTTCCCCG TOTCCACCTA TATCCCGGCC ACCATATGCG ATCAGATGAC CCGCATAATO 
1 101 CCCACGGATA TCTCACCTGA CGATCCACAA AAACrTCTCC TTGCGCTCAA CCACCCAATC GTCATTAACC GTAACACTAA CACGAACACC AATACCATGC 
1201 AAAATTACCT TCTGCCAATC ATTCCACAAG GCTTCACCAA ATGGGCCAAG CACCCCAAAG AAGATCTTCA CAATCAAAAA ATCCTCCCCA CCAGAGAGCG 
1301 CAAGCTTACA TATGCCTGCT TGTGGCCGTT TCGCACTAAG AAACTGCACT COTTCTATCC CCCACCTCGA ACGCACACCA TCGTAAAACr CCCAOCCTCr 
1401 nTACCCCTT TCCCCATGTC ATCCCTATGC ACTACLILI I TCCCCATGTC CCTCAGGCAG AAGATGAAAT TGCCATTACA ACCAAAGAAG GAGGAAAAAC 
1501 TGCTCCAAGT CCCGGAGGAA TTACnTATCG AGGOCAAGCC TCCnTCGAG CATCCTCACG AGGAATCCAC AGCGGAGAAG CTCCGAGAAG CACTCCCACC 
1601 ATTAGTGCCA CACAAAGCTA TCGACGCACC TCCGCAAGTT GTCTCCCAAG TGGAGCCGCT CCACCCGGAC ACCGCACCAG CACTCCTCGA AACCCCCCGC 
1701 CCTCATCTTAA GGATAATACC TCAACCAAAT GACCCTATGA TCGCACAGTA TATCGTTCTC TCGCCGATCT CTGTGCTGAA GAACGCTAAA CTCGCACCAG 
IflOl CACACCCGCT ACCAOACCAG GTTAAGATCA TAACGCACTC CGGAAGATCA CGAACCTATG CAGTCGAACC ATACGACGCT AAACTACTGA TCCCACCAGG 
1901 AAOTCCCCTA CCATGCCCAG AATTCTTAGC ACTGACTGAG AGCCCCACCC TTOTOT A CAA CGAAAOAGAG UIOILA ACC GCAACCTGTA CCATATTCCC 
2001 ATCCACCCTTC CCGCTAAGAA TACAGAAGAG GAGCACTTACA ACGTTACAAA GCCAGACCTC GCAOAAACAG AGTACCTCTT TCACGTGGAC AAGAAGCCAT 
2101 CCCTTAAGAA GCAAGAAGCC TCACGACTTG TCCTTrCCCG ACAACTCACC AACCCGCCCT ATCACGAACT AGCTCXTGAG GGACTGAAGA CTCGACCCCC 
2201 CCTCCCGTAC AAGCntSAAA CAATACCAGT GATACGCACA CCAGCATCCG CCAAGTCACC TATCATCAAG TCAACTCTCA CGCCACCTGA TCTTGTTACC 
2301 AGCCGAAACA AAGAAAACTG CCGCGAAATT CAGGCCCACG TCCTACGCCT CACCGCCATG CACATCACGT CGAAGACAGT CCATTCCGTT ATCCTCAACC 
3401 GATCCCACAA AOCCCTAGAA CrGCTCTATG TTGACGAAGC CTTCCCGTCC CACGCAGGAC CACTACrTGC CrrGATTGCA ATCGTCAGAC CCCGTAAGAA 
2501 CGTAGTACTA TGCGOAGACC CTAAGCAATG CCGATrCTTC AACATCATCC AACTAAAGGT ACATTTCAAC CACCCTCAAA AAGACATATG TACCAAQACA 
2601 TTCrACAAGTTTATCTCCCC ACGTTCCACA CACCCAGTCA CGGCTATrCT ATCGACACTG CATTACGATG GAAAAATCAA AACCACAAAC CCGTOCAAGA 
2701 AGAACATCGA AATCCACATT ACAGGGGCCA CGAAGCCGAA CCCAGGCGAC ATCATCCTGA CATCTTTCCG CCCGTGCaTT AACCAACTGC AAATCCACTA 
U0» TCCCCCACAT CAGGTAATGA CAGCCCCCCC CTCACAAGGG CTAACCACAA AAGCACTATA TGCCCTCCCG CAAAAAGTCA ATCAAAACCC CCTCTACGCC 
2901 atcacatcag ACCATCTGAA CCTCTTCCTC ACCCGCACTC ACGACAGCCT AGTATCCAAA ACTTTACACC gcgacccatg cattaaccag ctcactaacc 
3001 TACCTAAACC AA AllMCA G GCCACCATCC AGCACTGCGA AGCTGAACAC AAGGGAATAA TTCCTGCCAT AAACACTCCC CCTCCCCCTA CCAATCCGTT 
3101 CAGCTGCAAG ACTAACCTTT CCTCGCCOAA AGCACTCGAA CXXfATACTCG CCACGCCCCG TATCCTACTT ACCCCTTCCC AGTGGAGCGA GCTCTTCCCA 
3201 CACTTTCCGG ATGACAAACC ACACTCGGCC ATCTACCCCT TACACGTAAT TTCCATTAAC M MiC CCCA TCCACTTCAC AAGCGGCCTG TTTTCCAAAC 
3301 ACACCATCCC GTTAACGTAC CATCCTCCCC ACTCAGCCAG CCCACTACCT CATTCCCACa ACACCCCAGG aacacgcaag tatccctacc atcaccccct 
3401 TGCCGCCGAA CTCTCCCGTA CATTTCCCCT GTTCCAGCTA GCrCCCAAAC GCACACACCT TCATTTCCAG ACGGGCACAA CTAGAGTrAT CTCTCCACAG 
3301 CATAACTTCG TCCCACTCAA CCGCAATCTC CCrCACCCCT TAGTCCCCCA CCACAACGAG AAACAACCCG CCCCCCTCCA AAAATTCTTC AGCCAGTTCA 
3601 AACACCACTC CCTACTTCTC ATCTCAGAGA AAAAAATTGA ACCTCCCCAC AAGAGAATCG AATGGATCGC CCCCATTCOC ATACCCGCCG CAGATAACAA 
3701 CTACAACCrC GCTTTCCCCT TTCCCCCGCA CCCACCCTAC CACCTGCTCT TCATCAATAT TCCAACTAAA TACACAAACC ATCACTTTCA ACAGTGCCAA 
3801 GACCACGCCG CGACCTTGAA AACCCTTTCC CCTTCCCCCC TCAACTCCCT TAACCCCCCA CCCACCCTCC TCCTCAACTC CTACCCITAC GCCGACCGCA 
3901 ATAGTCACGA CCTACTCACC CCTCTTCCCA GAAAATTnJT CACAGTCTCT CCACCGACCC CAGAGTCCCTr CTCAAGCAAT ACAGAAATCT ACCTOATTTr 
4001 CCCACAACTA CACAACAGCC CCACACCACA ATTCACCCCC CATCATTTCA AtTCTCTCAT TTCGTCCCTC TACOAGCGTA CAACAGACCG AGTTCGAGCC 
4101 GCACCGTCGT ACCCTACTAA AACGGAGAAC ATTCCTCATT CTCAACACCA ACCACTTCTC AATCCAGCCA ATCCACTCCG CAGACCAGGA GAAGOAOI^ 
4201 GCCGTCCCAT CTATAAACGT TGGCCGAACA GTTTCACCGA TTCAGCCACA GAGACAGGTA CCGCAAAACT CACTCTCTCC CAAGGAAAGA AAOTCATCCA 
4301 CGCCCTTCGC CCTCATTTCC CGAAACACCC ACACGCAGAA CCCCXXiAAATTGCTCCAAAA cccctaccat gcactggcag acttagtaaa tgaacataat 
4401 ATCAAGTCrC TCGCCATCCC ACTGCTATCT ACAGCCATTT ACGCAGCCCG AAAAGACCCC CTTGAGCTAT CACTTAACTC CTTGACAACC GCGCTACACA 
4S0t GAACTGATGC CCACGTAACC ATCTACTCCC TGCATAAGAA GTGGAAGGAA AQAATCGACG CCGTGCTCCA ACTTAAGGAG T C T O T A ACTG ACCTCAAGGA 
4601 TGACGATATO GACATCGACG ACCACTTAGT ATGCATCCAT CCCCACACTT GCCTCAACCC AAGAAAGCCA TTCACTACTA CAAAAGOAAA GTTGTATrCC 
4701 TACTTTGAAG CCACCAAATT CCATCAACCA CCAAAAGATA TCGCCGAGAT AAAGGTCCrC TTCCCAAATG ACCACGAAAG CAACCAACAA CTGTGTCCCT 
4801 ACATATTGCC GCAGACCATG CAACCAATCC GCCAAAAATG CCCGGTCCAC CACAACCCCT CCTCTACCCC GCCAAAAACG CTCCCGTCCC TCrCTATCTA 
4901 TCCCATOACC CCAGAAAGCG TCCACAGACT CAGAAGCAAT AACCTCAAAG AACTTACACT ATCCTCCT CC ACCCCCCTTC CAAACTACAA AATCAAGAAT 
SOOl GTTCACAAGG TFCAOTQCAC AAAAGTACTC CTGTTTAACC CCCATACCCC CGCATrCGTT CCCGCCCCTA AGTACATACA AGCACCAGAA CAGCCTGCAG 
SIOI CTCCCCCTGC ACACCCCGAG GACCCCCCCG CACTTCTAGC GACACCAACA CCACCTCCAC CTCATAACAC CTCCCTTCAT CTCACCOACA TCTCACTGOA 
5201 CATGCAAGAC AGTAGCGAAG GCTCACTCTT TTCCAGCnT AGCGGATCCG ACAACTACCG AACGCAGCTG GTCGTGGCTG ACGTCCATGC CGTCCAAGAG 
5301 CCTGCCCCTG TTCCACCCCC AAGCCTAAAG AAGATGGCCC GCCTCGCACC GCCAAGAATG CACCAAGAGC CAACFCCACC GGCAAGCACC AGCTCTCCGG 
5401 ACCAGTCCCT TCACCTTTCT nTGATGGCO TATCTATATC CTTCOCATCC CTTTTCGACG CAGAGATGGC CCGCTTCGCA GCCGCACAAC CCCCCGCAAG 
5501 TACATGCCCT ACGCATGTCC CTATGTCnT CCCATCCTTT TCCGACGCAC ACATTGAGGA GTTGAGCCCC AGAGTAACCG ACTTCGGAGCC CCilCCHilH 
5«l CCCTCATTTO AACCCCGCGA AGTCAACTCA ATTATATCGT CCCGATCAGC CCTATLI 111 CCACCACGCA AGCACAGACG TAGACCCACC AGCAGQAGGA 
5701 CCCAATACTG TCTAACCGCC GTAGGTCGGT ACATATTTTC GACGGACACA GGCCCTCGCC ACTTGCAAAA GAAGTCCCTT CTGCAGAACC ACCTTACACA 
5801 ACCCACCTTG GAGCGCAATC TTCTCCAAAG AATCTACGCC CCGGTGCFCG ACACCTCGAA acaccaacag ctcaaactca ggtaccaqat gatgcocacc 
5901 OAAGCCAACA AAAGCACGTA CCACTCTCGA AAACTTAGAAA ACCAGAAAGC CATAACCACT GAGCGACTCC TTTCAGGGCT ACGACTCTAT AACTCTCCCA 
6001 CAGATCAGCC AGAATCCTAT AACATCACCT ACCCCAAACC ATCGTATTCC AGCAGTGTAC CACCGAACTA CTCrCACCCA AACnrCCTC TAGCTcnrc 
6101 TAACAACTAT CTGCATGAGA ATTACCCGAC CCTAGCATCT TATCACATCA CCGACGAGTA CCATCCTTACTrCCATATCGTAGACGGGAC ACTCCCTTCC 
6201 CTAGATACrC CAA LI U 1 10 CCCCGCCAAG CTTAGAAGTT ACCCGAAAAG ACACGAGTAT AGAGCCCCAA ACATCCCCAG TGCCCTTCCA TCAGCOATCC 
6301 AGAACACGTT GCAAAACCTC CTCATTGCCC CGACTAAAAO AAACTGCAAC CTCACACAAA TCCGTGAACT GCCAACACTG CACTCAGCGA CATrCAACGT 
6401 TCAATCCTTT CGAAAATATC CATGCAATGA CCAGTATTGO GAGCACTTTG CCCGAAACCC AATTAGGATC ACTACTGAGT TCCTTACCCC ATACCTGCCC 
6S01 AGACTGAAAG GCCCTAAGGC CGCCCCACTC TTCGCAAACA CGCATAATTT GCTCCCATTG CAACAAGTGC CTATGGATAG ATTCGTCATO GACATGAAAA 
6601 GAGACCTGAA AGTTACACCT CGCACCAAAC ACACAGAAGA AAGACCGAAA GTACAAGTCA TACAAGCCGC AGAACCCCTG GCGACCGCTT ACCTATGCGG 
6701 CATCCACCCG GAGTTACTGC GCAGCCTTAC ACCCCTTTTO CTACCCAACA TTCACACGCT CTTTCACATG TCGGCCGAGG ACTTTCATCC AATCATACCA 
6801 CAACACTTCA ACCAAGGTCA CCCCGTACTC GAGACCGATA TCCCCTCC7TT CGACAAAAGC CAACACGACG CTATCCCGTT AACCGCCCTG ATGATCTTGC 
'6901 aagacctggg TCTCGACCAA CCACTACTCC ACTTGATCCA CTGCGCCTTT CGAGAAATAT CATCCACCCA TCTGCCCACG CCTACCCGTT TCAAATTCGG 
■JOOl GGCGaTCATG AAATCCGGAA TCTTCCrCAC CCTCnTGTC AACACAGTTC TCAATGTCGT TATCCCCAGC AGAGTATTGG AGGACCCGCT TAAAACGTCC 
7101 AAATGTGCAO CATTTATCGG CGACGACAAC ATTATACACC GAGTAGTATC TGACAAACAA ATCGCTGAGA CGTGTCCCAC CTCGCTCAAC ATGGACCITA 
7201 AGATCATTGA CCCAGTCATC CCCGAGAGAC CaCCTTACTT CTCCGCTCGA TrCATCTTCC AACATTCCGT TACCTCCACA CCCTCTCCCG TCGCCGACCC 
7301 CTTGAAAAGG Clblll AAGT TCGGTAAACC CCTCCCAGCC CACGATGACC AAGACGAACA CAGAAGACGC GCTCTCCTAG ATGAAACAAA GCCGTCCTTT 
7401 AGAGTAGGTA TAACAGACAC CTTAGCAGTa CCCGTGGCAA CTCGGTATGA CGTAGACAAC ATCACACCTG TCCTGCTCGC ATTGACAACT 1 1 fCCCCACA 
7501 GCAAAAGAGC ATTTCAAGCC ATCAGAGGCG AAATAAAGCA TCTCTACGCT GGTCCTAAAT AGTCAGCATA GTACATTTCA TCTGACTAAT ACCACAACAC 
7ai CACCACCATG AATAGACGAT TCTTTAACaT GCTCGCCCCC CGCCCCTTCC CACCCCCCAC TCCCATGTCG ACCCCGCCCA GAACGACCCA GGCCCCCCCG 
7701 ATGCCTGCCC GCAATGGGCT GGCTTCCCAA ATCCAGCAAC TGACCACAGC CGTCACTCCC CTAGTCATTG GACAGGCAAC TAGACCTCAA ACCCCACCCC 
7801 CACCCCCCCC GCCGCGCCAG AAGAAGCAGG CGCCAAAGCA ACCACCCAAC CCGAAGAAAC CAAAAACACA GGAGAAGAAG AACAAGCAAC CTCCAAAACC 
7901 CAAACCCCGA AACAGACACC CTATOCCACT TAAGTTGCAG GCCCACAGAC TCTTCCACCT CAAAAATGAC GACGGAGATC TCATCCGCCA CCCACTCCCC 
8001 ATGGAAGGAA AGGTAATOAA ACCACTCCAC GTGAAAGGAA CTATTGACCA CCCTCTGCTA TCAAAGCTCA AATTCACCAA CTCCTCACCA TACGACATGG 
tlOl AGTTCCCACA GTTCCCCCTC AACATCAGAA CTCAGCCCTT CACCTACACC ACTCAACACC CTCAAGGCTT CTACAACTGG CACCACCGAG CGCTCCAGTA 
COI TACTCGACGC AGATITACCA TCC O C C G C CO ACTAGGACCC AGAGGACACA CTCCTCCTCC GATTATGGAT AACTCAGCCC GCGTTGrCGC GATAGTCCTC 
8301 GGAGGCGCTG ATGAGGGAAC AACAACCOCC CTTTCOaTCO TCACCTGCAA TACCAAAGCC AACACAATCA AGACAACCCC CGAACGGACA CAAGACTGGT 
8401 CTCCTCCACC ACTGCTrCACC CCCATCTCCT TCCTTGCAAA CCTCACCTTC CCATCCAATC GCCCGCCCAC ATCCTACACC CCCGAACCAT CCACACCTCT 
8501 CGACATCCTC GAAGAGAACG TGAACCACGA CGCCTACGAC ACCCTCCTCA ACGCCATATT GCGCTCCGGA TCCTCCGCCA GAAGTAAAAG AAGCGTCACT 
8601 CACGACTTTA CCTTCACCAG CCCCTACTTG GGCACATGCT COTACTCTCA CCATACTCAA CCCTCCTTTA CCCCGATTAA CATCCAGCAG CTCTCGGATG 
8101 AAGCCOACGA CAACACCATA CCCATACAGA CTTCCCOOCA GTTTCGATAC GACCAAAGCG GAGCACCAAG CTCAAATAAC TACCCCTACA T C T C CCTC O A 
UOl CCAGCATCAT ACTGTCAAAG AACGCACCAT GGATGACATC AAQATCACCA CCTCACCACC CTCTAGAAGG CTTAGCTACA AACGATACrr TCTCCTXrCCG 
8901 AACTGrCCTC CACGGGACAG CGTAACCCTT AGCATAGCCA GTAGCAACTC AGCAACGTCA TCCACAATCG CCCCCAAGAT AAAACCAAAA TTCGTCGGAC 
9001 GGGAAAAATA TCACCTACCT CCLOl ICACG GTAAGAACAT TCCTTGCACA OTGTACCACC GTCTGAAAGA AACAACCCCC CCCTACATCA CTATGCACAC 
9101 GCCGGGACCG CATGCCTATA CATCCTATCT CGAGQAATCA TCAGGOAAAC TTTACCCCAA CCCACCATCC CGCAAGAACA TTACGTACGA GTCCAAGTGC 
9201 GGCGATTACA ACACCGGAAC CGTTACGACC CCTTACCGAAA TCACCGGCTC CACCCCCATC AACCAGTGCG TCCCCTATAA GAGCCACCAA ACGAAGTGCG 
9301 TCTTCAACTC GCCCGACTCG ATCAGACACG CCCACCACAC GCCCCAACCG AAATTCCATT TCCCTTTCAA GCTOATCCCO ACTACCTCCA TCCTCCCrCTr 
9401 TGCCCACCCQ CCGAACCTAG TACACGOCTT TAAACACATC AGCCTCCAAT TAGACACACA CCATCTGACA TTOCTCACCA CCACCACACT ACGGCCAAAC 
9501 CCCGAACCAA CCACTGAATG GATCATCGGA AACACCGTTA GAAACITCAC CCTrCGACCCA GATGCCCTCG AATACATATG CGCCAATCAC GAACCAGTAA 
9601 GGGTCTATCC CCAAGAGTCT CCACCAGCAG ACCCTCACGG ATGGCCACAC GAAATACTAC ACCATTACTA TCATCGCCAT CCTCTC7TACA CCATCTTAGC 
9701 CCTCCCATCA GCTCCTGTGG CGATGATGAT TCGCGTAACT GTTCCACCAT TATCTCCCTC TAAACCCCGC CGTCACTCCC TOACGCCATA TGCCCTCCCC 
9801 CCAAATGCCG TGATTCCAAC TTCGCTGCCA CmTCTCCT CTGTTAGCrC CCCTAATGCT CAAACATTCA CCGAGACCAT GAGTTACTTA TCCTCGAACA 
' 9901 CCCAGCCGTT CTTCTGGGTC CACCTGTGTA TACCTCTCCC CGGTCTCCTC CTTCTAATGC GCTCTTCCTC ATGCTCCCTG CCl 1 11 1 lAG TCCTTCOCCC 
10001 CCCCrACCrC GCGAAGGTAQ ACGCCTACCA ACATGCGACC ACTCTTTCCAA ATCraCCACA GATACCGTAT AAGCCACTTG rrOAAAGGGC AGGGTACOCC 
lOlOl CCCCTCAATT TCGAGATTAC TGTCATCTCC TCGGAGOTTT TCCCTTCCAC CAACCAAGAG TACATTACCT GCAAATTCAC CA CI ' OIG CTC CCCTCCCCTA 
10201 AACTCAGATG CTCCGGCTCC TTCCAATGTC AGCCCGCCCC TCACCCAGAC TATACCTCCA AC OIClllGG AGGGCTCTAC CCCTTCATCT GGGGAGCAGC 
10301 ACAATG 1111 TGCGACACTO AGAACAGCCA CATGAGTCAC GCCTACCTCG AATTCTCACTT AGATTGCCCG ACTGACCACG CGCAGCCGAT TAACCTGCAT 
10<0I ACTGCCGCGA TCAAAGTAGG ACT GCO T A TA GTGTACCOGA ACACTACCAG TTTCCTACAT GTCTACGTGA ACGCACTCAC ACCAGGAACG TCTAAAGACC 
lOSOt TCAAACTCAT ACCTCCACCA ATTTCACCAT TCTTTACACC ATTCGATCAC AACCTCCTTA TCAATCCCGG CCTGCTGTAC AACTATGACT TTCCGCAATA 
10601 CGGAGCGATG AAACCAGGAG C O! HU G AG A CATTCAACCT ACCTCCTTGA CTAGCAAACA CCTCATCGCC ACCACAGACA TTAGCCTACT CAAGCCTTCC 
10701 GCCAAGAACG TGCATGTCCC CTACACGCAG GCCCCATCTG GATTCGAGAT CTGGAAAAAC AACTCAGGCC GCCCACTCCA GGAAACCGCC CCTTTTGGCr 
10801 CCAAGATTGC AGTCAATCCG CTTCGACCCG TGCACTGCTC ATACGGCAAC ATTCCCATTT CTATTGACAT CCCGAACCCT GCCTTTATCA GCACATCAGA 
10901 TCCACCACTG CTCTCAACAG TCAAATCTGA TGTCACTCAG TGCACTTATT CAGCCGACTT CCGACGGATC CCTACCCTCC ACTATGTATC CGACCCCCAA 
UOOl GGACAATCCC CTGTACATTC GCATTCGAGC ACAGCAACCC TCCAAGACTC GACAGTTCAT GTCCTGGACA AAGGAGCGGT GACAGTACAC TTCACCACCG 
1 1 101 CGACCCCACA CGCOAACTTC ATTCTATCCC TCTGTCGTAA GAAGACAACA TCCAATGCAC AATCCAAACC ACCAGCTGAT CATATCGTGA CCACCCCCCA 
1 1201 CAAAAATGAC CAAGAATTCC AAGCCCCCAT CTCAAAAACT TCATGGACTT CCCTCTTTCC CCmrCCGC CGCCCCTCCT CCCTATTAAT TATAGCACrr 
11301 ATCATTTTTG CTTGCAGCAT CATCCTCACT ACCACACCAA CATGACCCCT ACCCCCCAAT CACCCGACCA GCAAAACTCG ATCTACTTCC CACCAACTCA 
11401 TCTCCATAAT GCATCACGCT CCTATATTAO ATCCCCGCTT ACCCCGGCCA ATATAGCAAC ACCAAAACTC GACCTATTTC CGACCAAGCG CAGTGCATAA 
11501 TGCTGCCCAG TCTTCCCAAA TAATCACTAT ATTAACCATT TATTCAGCCG ACCCCAAAAC TCAATCTATT TCTGAGGAAC CATCGTCCAT AATGCCATOC 
11601 AGCCTCrCCA TAACrmTA TTATTTCTTTTATTAATCAA CAAAATTTTC TTTTTAACAT TTC 

S.A.AR86 

Amino Acid Sequence of the Nonstructural Polyprotein 



I MEKPWNVDV DPQSPFWQL QKSFPQFEW AQQVTPNDHA NARAFSHLAS KLIELEVFTT ATtLDICSA? ARRMFSEHQY HCVCPMRSPE OPORMMXYAS 

101 KLAEXAOCrr HKNLHEKOCD LRTVLDfTPDA CTPSLCFHND VTCNTRAEYS VMQDVYmAP CmYHQAMKG VRTLYWICFD TTQFMFSAMA GSYPAYNTNW 

201 ADEXVLEARN IGLCnXLSE CRTCKUIMR KKELKPGSRV YT5VCSTLYP EHRASUJSWH LPSVFHLXCIC QSYT CRCD TV VSCECYWKK lilSWilfO E 

301 TVOYAVTNNS EC1W3CVTD TVKCERVSFP VCTYIPATIC DQMTXSIMATO ISPDOAQKLL VCLNQRIVIN CKTNRNTNTM QNYILPIIAQ GFSKWAXEIX 

401 EOLDNEKMLG TRERXLTYGC LWAFRTKXVH SFYRPPCTQT IVKVPASFSA FPMSSVWTTS LPM5LRQXMK LALQnCXEEK LLQVPEELVM EAKAAFEDAQ 

SOI EESRAEXLRE ALPPLVADKG lEAAAEWCE VECLQAOTCA ALVETPROKV RIIFQANDRM IGQYIWSPt SVUCNAXLAP AHPLAOQVKI tTHSCRSGRY 

ttl AVEFYDAKVL MPACSAVPWP EFtALSESAT LVYNEREFVN RKLYHIAMKC PAKHTEEEQY KVTKAELAET EYVFDVOKXR CVKKEEASGL VLSGELINPP 

■JOl YHELALECUC TRPAVPYKVE TIGVICTTGS CKSAIDCSTV TAROLVTSCX KENCREIEAD VLRLRGMQrr SfdVDSVMLN CCHKAVEVLY VOEAFRCHAG 

801 ALLALIAIVR PRXKWLCCD PKQCCFFNMM QLXVHFHHPE KDICITTmC FISRRCTQPV TAIVSTlilYO CKMKTTNFaC XNKIDrTCA TXTXPOOIIL 

001 TCFROWVKQL QIDYFCHEVM TAAASQCLTR XCVYAVRQXV NENPLYATTS EHVNVLLmT EORLVWKTLQ GDPWKQLTN VVKGNFQATI EDWEAEHXCI 

1001 [AAINSPAPR TNPFSCKTNV CWAKALEPUL ATAGtVLTCC QW5ELFPQFA DDXmSAIYA LOVICKPFG MOLTSCLFSK QSIPLTYHPA DSAJIPVAHWD 

1101 NSPGTRKYGY DHAVAAELSR RFPVFQLAGX CTQLOUTTGR TRVtSAQHNL VPVNRNLPKA LVPEHXEKQP GPVEXFLSQF KHKSVLVISE XXIEAFHXRl 

1201 EWIAPIGIAG ADKNYNIAFO FPPQARYDLV FINIGTXYRN HHFQQCEOHA ATUCTLSRSA LNCLNPCtTTL WKSYGYADR NSEOWTALA RXFVRVSAAR 

1301 PECVSSMTEM YUFRQLONS RTRQFTPHHL NCVISSVYEO TRDCVGAAPS YRTKRENlAO CQEEAWNAA NPLCRPGEGV CRAIYKRWPN SFTOSATETG 

1401 TAKLTVCQGK XVIHAVCPDF RKHPEAEALX LLQNAYHAVA DLVNEKNIXS VAIPLLSTCl YAACKDRLEV SLNCLTTALD RTDADYTIYC LDXKWXERID 

1501 AVLQLXESVT ELKOEDMEIO OELVWIHPDS CUCGRXGFST TKCKLYSYFE CTXFHQAAKD MAEOCVUTN DQESNEQLCA YILGETMEAI REXCPVDHNP 

1601 5SSPPKTLFC LCMYAMTPER VHRUISNNVK EVTVCSSTPL PKYKDCNVQK VQCTXWLFN PffTPAFVPAR KYIEAPEQPA APFAQAEEAP GWATTTPPA 

1701 AOKTSLDVTD tSLDMEOSSE CSLFSSFSGS DNYRRQVWA DVHAVQEPAP VPPPRLXKMA RLAAARMQEE PTTPAmSA OESUILSFOa VStSFQSLFD 

1801 CEMARLAAAQ PPASTCTTDV PMSFCSF5DO EIEEtSRRVT ESEPVLFCSF EPCSVNSnS SRSAVSFPPR XQRRRRRSRR -rEYCLTCVCO YIFSTDrrOPO 

1901 HLQKXSVLQN QLTEPTLERN VLERIYAPVL OTSKEEQIJCL RYQMMFTEAN KSRYQSRXVE NQKArTTERL USCUILYNSA TOQPECYXrT YPXPSYSSSV 

2001 PANYSDPKFA VAVCNNYLHE NYPTVASYQI TDEYDAYLOM VOCTVACUTr ATFCPAKLRS YPKRHEYRAP NIRSAVPSAM Orm.QNVUA ATKRNCNVTQ 

210X MRELfTLDSA TFNVECFRKY ACNDEYWEBP ARXPIRnTE FVTAYVARLX CPKAAALFAK THNLVPLQEV PMDRFVMDMK RDVKVTPCTX tTTEERPXVQV 

2201 IQAAEPLATA YLCGIHRELV RRLTAVLLPK IKTIPDMSAE DFDAIIAEHF KQCDPVLETO lASFDKSQDD AMAtTCLMlL EDLGVDQPLL DLIECAFGEI 

2301 SSTHLPTGTR FXFGAMMKSG MFLTUWTV LNWIASRVL EERLKTSKCA ARCODMIH GWSDKEMAE RCATWLHMEV KIIDAVIGER PPYFCCCFIL 

2401 QOSVTSTACR VAOPLKRLFX LGKPLPADDE QOEDRRRALL DCTKAWFRVC ITOmAVAVA TRYEVDNTTP VLLALRTFAQ SXRAFQAIRG EDCHLYGGPK 



Amino Acid Sequence of the Structural Polyprotein 



1 MNRCFFNMLG RRPFPAPTAM WRPRRRRQAA PMPARNGLAS QIQQLTrAVS ALVICQATRP QTPRPRPPPR QKKQAPKQPP KPKKPKTQEK KXKQPAKPXP 

101 GXRQRMALXL EAORLFDVKN EOCDVIGHAL AMEGXVMXPL HVXCnOHPV LSKLKFTXSS AYOMEFAQLP VNMB5EAFTY TSEHPEGFYN WHHCAVQYSG 

201 GRFnPRGVC GRCDSGRPIM ONSGRWAIV LGCAOECnKT ALSWTWNSK CKIULIIPEC TEEWSAAPLV TAMCLLGNVS FPCNRPPrCY TKEPSRALOI 

301 LEENVNHEAY DTUKAILRC CSSGSSXRSV TDDFTLTSPY LGTCSYCHHT EPCFSPDCIE QVWDEADDNT IRiqTSAQFG YDQSGAASSN KYRYMSLEQD 

401 HTVXECTMDD IKSTSCFCR RLSYXCYFLL AKCPPCDSVT VSIASSNSAT SCTMARXOCP KFVGREXYDL PPVHGKXIPC TVYDRLXETT ACYrTMHRPC 

501 PHAYTSYLEE SSCXVYAKPP SGKNTTYECK CCOYKTGTVT TRTEfrCCTA DCQCVAYKSD QmcWVFNSPD SIRHAOHTAQ GKLHLPFKU PSTCMVPVAH 

60! APNWHGFKH ISLQLOTOHL TLLTTRRLCA NFEPTTEWII CfOVRNFTVO RDGLEYIWGN HEPVRVYAQE SAPCOPHGWP HEIVQHYYHR HPVYTfLAVA 

■701 SAAVAMMIGV TVAALCACKA RRECLTPYAL APNAVIPTSL ALLCCVRSAN AETFTFTMSY LWSNSQPFFW VQLOPLAAV WLMRCCSCC LPFLWAOAY 

Ml LAKVDAYEKA TTVPNVFQIP YXALVERAGY APLNLETTVM SSEVLPSTNQ EYTTCXFTTV VPSPKVRCCG SLECQPAAHA DY7CKVFCCV YPFMWCCAQC 

901 FCDSENSQMS EAYVBLSVOC ATOHAQAIKV KTAAMKYCLR TVYCNTTSFL DVYVNCVTPG TSXDUCVUG PISALFIPFO HKWINRGLV YNY PPPEYC A 

1001 MKPGAFGDIQ ATSLTSKOU ASTDIRLLKP SAXNVHVPYT QAASCFEMWK NKSGRPLQET APFGCXIAVN PLRAVDCSYC NIPISIDIFN AAF1RTSDAP 

not LVSTVKCDVS ECTYSAOFCC MATLQYVSDR ECQCPVHSHS STATLQESTV HVLEKGAVTV HFSTASPQAN F1VSLCCKKT TCNAECXPPA OfflVSmiKN 

1201 DQEFQAAISK TSWSWLFALF GCASSLUIG LMIFACSMML T5TRR 

1 NTTCNCCCCG TACTATACAC TATTCAATCA AACAGOCCSAC CAATTGCACT ACCATCACAA TCGACAACCC AGTACTTAAC CTAGACCTAC ACCCCCAOAG 
101 TCCGTTTCTC QTGCAACTCC AAAAOAOCTT CCCGCAATTT GAGCTAGTAO GACACCAOOT CACTCCAAAT CACCATCCTA ATGCCAOAGC ATTTTCCCAT 
201 CrCCCCAGTA AACTAATCGA GCTGGAGGTT CCTACCACAO CCACCATTTT CGACATACGC ACCCCACCCG CTCCTAOAAT GTTTTCCCAO CAGCAOTACC 
301 ATTOCCrrTTG CCCCATGCCTT ACTCCACAAC ACCCCGACCG CATCATCAAA TATGCCACCA AACTCCCCGA AAAAOCATCC AAGATTACCA ATAAGAACTT 
401 GCATGAGAAG ATCAAGCACC TCCCGACCCT ACTTOATACA CCGQATGCTG AAACCCCATC ACTCTCCTTC CACAACOATG TTACCTCCAA CACGCQTGOC 
sot OAGTACrCCC TCATCCACCA COTCTACATC AACCCTCCCG CAACTATXTA CCATCACCCT ATGAAAGCCC TCCCGACCCT CTACrOCATT CCCTTCCATA 
601 CCACCCACTT CATOTTCTCO CCTATOGCAO CTTCCTACCC TCCCTACAAC ACCAACTCGG CCGACGAAAA A G T CC T CO AA GCCCOTAACA TCCOA CfClU 
701 CACCACAAAC CTOACTCAAO GCAGGACAGC AAAUIILICU ATAATGACCA AGAACGAGTT CAACCCCCGG TCACGCCTTT ATTTCTCCCT TCCATCOACA 
801 CTTTACCCAG AACACACAGC CACCTTCCAG AGCTCGCATC TTCCATCGGT GTrCCACCTG AAACGAAAGC AGI C GIACAC TTCCCCCTCT GATACAQTGG 
901 TCACCrOCCA AGGCTACOTA GTOAAGAAAA TCACCATCAG TCCCGGGATC ACCGGACAAA CCGTCGGATA CCCGOTTACA AACAATAGCG ACGGCTTCTT 
1001 CCTATGCAAA CTTACCGATA CAGTAAAAGG AOAACGGCTA TCGTTOCCCG TOTGCACGTA TATCCCOGCC ACCATATOCG ATCAGATGAC CGCCATAATG 
1 tot GCCACCCATA TCrCACCTCA CCATCCACAA AAACTTCTCG TTGCCCTCAA CCACCGAATC CTCATTAACG GTAAGACTAA CAGGAACACC AATACCATGC 
1201 AAAATTACCTTCTGCCAATC ATTGCACAAC GGTTCAGCAA ATCGGCCAAG GAGCGCAAAG AACACCTTCA CAATCAAAAA ATGCTCGGTA CCACAGAGCG 
1301 CAAGCTTACA TATOOCTCCT TCTCOCCOTT TCCCACTAAG AAAGTGCACT CCTTCTATCC CCCACCTGCA ACCCAGACCA TCCyTAAAACT CCCAGCCTCT 
1401 TTTACCCCrr TCCCCATGTC ATCCCTATCG ACTAU.1L1 1 TCCCCATCTC GCTCAGGCAG AAGATAAAAT TCCCATTACA ACCAAAGAAG GAGGAAAAAC 
1501 TCCTGCAAGT CCCGGAGCAA TTAGTCATCG AGGCCAAGGC TCCTTTCCAG CATCCTCAGG AGGAATCCaG ACCGGAGAAG CTCCGACAAO CACTCCCACC 
1601 ATTACTGCCA GACAAACCTA TCOAGOCAGC CGCGGAACTT GTCTCCGAAG TGCACCCGCT CCACCCCGAC ATCCGAGCAG CACTCGTCGA AACCCCGCGC 
1701 GCrCATGTAA CGATAATACC ACAACCAAAT GACCCTTATGA TCGCACAGTA CATCCTTGTC TCCCCAACCr CrCTCCTGAA GAACGCTAAA CTCGCACCAG 
1801 CACACCCCCr AGCAGACCAG CTTAAGATCA TAACCCACTC CCGAAOATCA GGAAGGTATG CACTCCAACC ATACGACGCT AAACTACTCA TGCCAGCACG 
1901 AAGTGCCCTA CCATGCCCAG AATTCTTAGC ACTGAGFrGAG AGCGCCACCC TAGTGTACAA CCAAAGAGAG HIUIO AACC GCAAGCTGTA CCATATrOOC 
2001 ATGCACCCTC CCCCTAAGAA TaCaGAAGAG GACCAGTACA AGGTTACAAA GCCACACCTC CCAGaAACAG AGTACCTrClT TCACCTGGAC AAGAAGCCAT 
2101 GOCTTCAAGAA GCAAGAACCC TCAGGACTTG TCCTCTCGGC AGAACTGACC AACCCCCCCT ATCACGAACT AGCTCTTGAG CGACTGAAGA CTCCACCCGT 
2201 GCTCCCGTAC AAGGTTQAAA CAATAGGAGT GATACCOGCA CCAGGATCCG GCAACTCGGC TATCATCAAG TCAACTCTCA CGGCACGTCA TCTTGITACC 
2301 AGCGGAAAGA AAGAAAACTG CCGCGAAATT CAGGCCCATG TGCTACCGCT GAGGGGCATG CAGATCACCT CQAAGACAGT CCATrCCGTT A T GC TC A ACO 
:401 GATCCCGCAA acccctacaa ctcctctatc TTCACGAACC GTTCCCCTCC CACCCACCAG CACTACTTCC CTTCATTCCA ATCCTCAGAC cccctcataa 
2501 CCTACTCCTA TCCCGAGACC CTAAGCAATG CGGATTCTTC AACATGATGC AACTAAAGGT ATATTTCAAC CACCCCCAAA AACACATATO TACCAAGACA 
2601 TTCTACAAOT TTATCTCCCC ACC7TTGCACA CAGCCACTCA CCCCTATTGT ATCCACACTC CATTACGATG GAAAAATCAA AACCACAAAC CCCTGCAAGA 
27DI AGAACATCCA AATCGACATT ACACCGCCCA CGAAGCCGAA GCCACCGGAC ATCATCCTGA CATOCTTCCC CGGGTGGGTT AAGCAACTCC AAATCCACTA 
2801 TCCCCCACAT GACGTAATGA CAGCCCCCCC CTCACAACCC CTAACCACAA AACGACTTATA TGCCCTCCCG CAAAAACTCA ATGAAAACCC CCTGTACGCG 
2901 ATCACATCAG AGCATCTCAA CCTGCTGCTC ACCCGCACTG AGGACAGGCT ACTATCGAAA ACTTTACAGG GCGACCCATG GATTAACCAG CTCACTAACC 
3001 TACCAAAAGG AAATTTTCAA GCCACCATCG ACCACTGGCA AGCTCAACAC AAGGGAATAA TTCCTCCGAT AAACAGTCCC CCTCCCCGTA CCAATCCCTT 
3101 CAGCTCCAAC ACTAA LUl 1 1 CCTGCGCGAA ACGACTGGAA CCCATACTCG CCaCGGCCCG TATCCTACIT ACCGGTTGCC AGTGQAGCCA GCTGTTCCCA 
3201 CACTTTCCAG ATGACAAACC ACACTCCCCC ATCTACCCCC TCCACGTAAT CTCCATTAAG TTTTTCCCCA TCCACTTCAC AACCGGACTG ttttccaaac 
3301 AGAGCATCCC GTTAACCTAC CATCCTCCCG ATTCAGCCAG GCCAGTAGCT CATTGGGACA ACAGCCCAGG AACCCGCAAG TATGGCTACG ATCACGCCCT 
3401 TCCCCCCCAA CTCTCCCCTA GATTTCCGGT CTTCCAGCTA GCTCCCAAAG GCACACAGCT TGATTTCCAG ACCCGCAGAA CTAGAGTTAT CTCCGCACAO 
3S0I CATAACTTCG TCCCACTCAA CCCCAATCTC CCGCACGCCT TACTCCCCGA CCACAACGAG AAACAACCCC CCCCCGTCAA AAAATTCTTG AGCCAGTTCA 
3«l AACACCACrC CCTACTTCTG CTCTCAGAGG AAAAAATTGA AGCTCCCCAC AAGAGAATCG AATGGATCGC CCCGATTCGC ATaGCCGCCC CTCATAAGAA 
3701 CTACAACCrC CCTTTCCCCT TTCCCCCCCA GCCACGGTAC CACCTCGTCT TT ATCAATAT TCGAACTAAA TACAGAAACC ATCACTTTCA GCACTCCCAA 
3801 CACCATGCCO CGACCTTCAA AACCCTCTCC CCTTCCCCCC TCAACTCCCT TAACCCCCGA GCCACCCTCG TCCTCAACTC CTACGGTTAC CCCGACCCCA
3901 ATAGTOACaA COTAOTCACC CCTCITGCCA CAAAAl 1 101 CAGACTCTCT CCAGCCACGC CACACTCCCT CTCAACCAAT ACAOAAATGT ACCTCATCTT 

4001 cccacaacta cacaacaccc ccacacgaca attcaccccc catcatctca a t tc tct o at rrccn c co'l o TACGACGGTA CAACAGACCG ACrraCAGCC 

4101 CCACCCTCAT ACCCCACTAA AACCGACAAC ATTCCnSATT CTCAAGACGA ACCAC7ITGTC AATCCACCCA ATCCCCTCGC CACACCACCC GAAGCAGTCT 
4201 GCCCTCCCAT CTATAAACGT TGCCCCAACA OlllCACCCA TTCACCCACA GACACCCGCA CCGCAAAACT CACTCTCT G C CAAGCAAAGA AAGTGATCCA 
4301 CCCGGITCCC CCTGAIIIU. CGAAACACCC AGAGGCAGAA GCCCTGAAAT TGCTCCAAAA CCCCTACCAT CCACrCGCAG ACTTAGTAAA TGAACATAAT 
4401 ATCAACTCTC TCGCCATCCC ACTCCTATCT AGAGGCATIT ACGCACCCCG AAAAGACCGC CTTOAAOTAT CACTTAACrO CTTCACAACC CCGCTAGATA 
4501 GAACTGATCC GCACCTAACC ATCTACTGCC TGCATAAGAA GTCCAAGGAA AGAATCGACG CGCTGCTCCA ACrrAACGAC TCrCTAATAC ACCTCAACGA 
4601 TCAGGATATG GAGATCGACG AGGAGTTAGT ATGGATCCAT CCCGACAGIT GCCTGAAGGG AACAAAGGCA TTCAGTACTA CAAAACGAAA GTTCTATTCC
4701 TACfTTGAAG GCACCAAATT CCATCAAGCA GCAAAAOATA TGGCCGAGAT AAAGGTOCTG TTCC C AAATO ACCACGAAAG CAACGACCAA CTCTCr G CCT 
4801 ACATATTCCG CGAGACCATG OAAGCAATCC GCGAAAAATG CCCCGTCQAC CACAACCOCT CGTCTACCCC GCCAAAAACC CT CCCG T G CC TCFGCATCtTA
4901 TCCCATGACO CCAGAAACGG TCCACACACT CAGAAGCAAC AACCTCAAAG AAGITACAGT ATCCTCCTCC ACCCCCCTTC CAAAGTACAA AATCAAOAAC 
5001 CTTCAGAACG TTCACTCCAC AAAACTAGTC CTOTTTAACC CCCATACCCC TGCATTCCTT CCCCCCCCTA ACTACATAGA AGCGCCAQAA CACCCTCCAC
5101 CTCCCCCTGC ACAGGCCCAG GAGCCCCCCG AACTTCCAGC AACACCAACA CCACCTGCAG CTGATAACAC CTCGCTTCAT GTCACCGACA TCTCACTCGA
5201 CATGGAAGAC ACTAGCGAAG GCrCACTCTT TTCGACCTTT AGCCOATCGG ACAACTCTAT TACTAGTATG GACAGTTCCT CCTCACCACC TAGTTCACTA
5301 CACATACTAG ACCGAAGGCA GGTGGTCGTG GCTOACGTCC ATC CC PT C CA AGACCCTCCC CCTCTTCCAC CCCCAACCCT AAAOAAGATO GCCCGOCTCG
5401 CAGCCGCAAC AATCCACGAA CAGCCAACTC CACCCGCAAG CACCACCTCT CCCGACCACT CCCTTCACCT TTCTTTTCGT CCCCTATCCA TCTCCTTCGC 
5501 ATCCCmrC GACGCAGAGA TCCCCCCCTT CCCAGCCGCA CAACCCCCGC CAACTACATC CCCTACGCAT CTCCCTATGT CTTTCCCATC GHU CCCAC 
5601 CGAGAGATTO AGGACCrOAG CCCCAGAGTA ACCGAGTCTG ACC CCG TCCl OIIIGGCTCA TTTCAACCGG GCCAAGTCAA CTCAATTATA TCCJTCCCGAT
5701 CAOIIGIATC TTXTCCACCA CGCAAGCAGA GACGTACACC CAGCAGCAGG AGCACCGAAT ACTCACTAAC CCCCGTACCT CGGTACATAT TTTCQACCGA
5801 CACACGCCCT CGGCACTTCC AAATCGACTC CCTTCTGCAC AATCACCTTA CAGAACCCAC CTTCGAGCCC AATGTTCTGG AAAGAATCTA CGCXCCGCTG 
5901 CTCCACACGT CGAAACAGGA ACAGCTCAAA CTCAGGTACC AGATCATGCC CACCCAAGCC AACAAAAGCA CGTACCACTC TAGAAAAGTA GAAAATCAGA 
6001 AAGCCATAAC CACTCAGCCA CTGCTXTCAG GCCTACCACT OTATAACTCT GCCACACATC AGCCACAATO CTATAAGATC ACCTACCCCA AACCATCGTA 
6101 TTCCACCAGT GTACCCCCGA ACTACTCT C A CCCAAACTTT GCTGTAGCrG TTTCCAACAA CTATCTCCAT CAGAATTACC CCACCGTAGC ATCTTATCAG
6201 ATCACCGACC AGTACGATCC TTACTTGCAT ATCGTACACG GGACAGTCGC TTCCCTAGAT ACTCCAACTT TITCCCCCGC CAAGCTTACA AGTTACCCGA
6301 AAAGACACGA GTATAGAGCC CCAAACACTC GCAGTGCGCT TCCATCAGCG ATGCAOAACA CCTrTCCAAAA CGTGCTCATT GCCCCGACTA AAAGAAACTG 
6401 CAACGTCACA CAAATGCGTG AATTCCCAAC ACTGCACTCA GCGACATTCA ACGTTGAATC CTTTCGAAAA TATGCATCTA ATCACGAGTA TrCGGAGGAG 
6501 nrCCCCGAA AGCCAATTAG GATCACTACT GAGTTCCITA CCGCATACCT CGCCACACTG AAAGCCCCTA AGGCCGCCCC ACTCTTCGCA AAGACCCATA
6601 ATTTCCTCCC ATTGCAACAA GTCCCTATCG ATAGGTTCCT CATCGACATG AAAAGAGACG TGAAAGTTAC ACCTGCCACG AAACACACAO AACAAAGACC 
6701 CAAAGTACAA GTCCTACAAG CCCCAGAACC CCTCGCGACC GCTTACCTCT CCGCCATCCA CCCGGAGTTA CTGCGCACCC TTACAGCCGT CITCCTACCC 
6801 AACATTCACA CGCmTTGA CATGTCGGCG GAGGACTTTG ATGCAATCAT AGCAGAACAC TTCAAGCAAG CTGACCCGGT ACTGGACACG GATATCGCCT 
6901 CCTTCGACAA AACCCAACAC CACGCTATGG CGTTAACTGO CCTGATGATC TTGGAAGACC TCGCTTCTCCA CCAACCACTA CTCCACTTGA TCGAGTCCCC 
7001 CTTTCGAGAA ATATCATCCA CCCATCTCCC CACGCCTACC COIM CAAAT TCGCCGCCAT GATCaAATCC GGAATGTTCC TCACCCTCTT TCTCAACACA 
7101 CTTCTGAATC TCGTTATCGC CAGCAGAGTA TTGGACGAGC GGCTTAAAAC GTCCAAATCT CCAGCATTTA TCGGCGACGA CaACATCATA CACGGAGTAC 
7201 TATCTCACAA AGAAATGCCT GACAGGTGTO CCACCTGCCT CAACATCGAC CTTAAGATCA TTGACCCAGT CATCCGCCAC AGACCCCCTT ACrrCTGCCG
7301 TGOATTCATCTTGCAAGATr CGCTTACCTC CACAGCCTCT CCCGTCGCCO ACCCCTTGAA AACCCTCnT AACTTCGCTA AACCCCTCCC AGCCCACCAC 
7401 GACCAACACC AACACAGAAG ACGCCCTCTG CTAGATGAAA CAAAGGCGTG GTTTAGAGTA GGTATAACAG ACACCTTACC ACTCCCCCTG CCAACTCGCT 
7501 ATGAGCTAGA CAACATCACA CCTCTCCTCC TGGCATTOAG AA CHMO CC CACAGCAAAA GAGCATTTCA ACCCATCAGA GGGGAAATAA AGCATCTCTA 
7601 CCCTGCrr C CT AAATAGTCAG CATAGCACAT TTCATCTGAC TAATACCACA ACACCACCAC CATOAATACA GGATTCTTTA ACATCCTCGC CCGCCCCCCC 
7701 TTCCCCCCCC CCACTGCCAT CTCGACGCCC CGGAGAAGCA CGCAGGCGGC CCCCATCCCT GCCCGCAATC GGCTGGCTTC CCAAATCCAC CAACTCACCA
7801 CAGCCGTCAG TGCCCTACTC ATTCCACAGG CAACTAGACCTCAAACCCCA CCCCCACGCC CGCCCCCGCC CCAGAAGAAG CAGGCCCCAA AGCAACCACC 
7901 CAACCCCAAG AAACCAAAAA CACACGACAA GAACAAGAAG CAACCTGCAA AACCCAAACC CCGAAAGACA CAACCTATCC CACTCAACTT OCACGCCCAC
8001 ACACTCTTCC ACCTCAAAAA TGAGGACGGA GATCTCATCC CCCACGCACT CCCCATCGAA CGAAACCTAA TGAAACCACT CCACCTCAAA GCAACTATTC
8101 ACCACCCrCT CCTATCAAAG CTCAAAITCA CCAAGTCCTC AOCATACOAC ATCGAGTTCG CACACTTCCC GGTCAACATG AOAACrTOAGG CCTTCACCTA
8201 CACCACCGAA CACCCTOAAG UUIUIA CAA CTCGCACCAC CGACCCCTCC ACTATAGTCG AGGTACATTT ACC A TC CC C C GCCGACTAGG AGGCAGAGQA
8301 GACAGTGC7TC GTCCGATTAT CGATAACTCA GGC COGO TI U TCCCCATACT CCTCCGACGG CCTCATGAGC GAACAAOAAC TCCCCITTCG CTCOTCACCT
8401 GCAATAGCAA ACGGAAGACA ATCAAGACAA CCCCCCAACC GACAGAAGAC TCCTCTCCAG CACCACTGGT CACGGCCATG TCCTTCCITG GAAACOTCAO
8501 CTTCCCATCC AATCCCCCOC CCACATCCTA CACCCCCGAA CCATCCACAG CTCTTGACAT CCTTGAAGAG AACGTGAACC ACOAGGCCTA CGACACCCTO
8601 CTCAACGCCA TATTOCGGTG COCATCGTC C CGCAGAAGCA AAACAAGCCT CACTCACGAC TTTACCTTCA CCACCCCCTA CTTCGCCACA T Oc r c o TA cr
8701 CTCACCATAC TGAACCCTGC TTTACCCCGA TTAAGATCCA CCACGTCrCG GATCAAGCCG ACCACAACAC CATACGCATA CACACTTCCO CCCA CiHIOO
8801 ATACCACCAA ACCGGACCAO CAACCTCAAA TAAGTACCCC TACATGTCGC TCGAGCAGGA TCATACCGTC AAAOAAGGCA CTATGGATGA CATCAACATC

8901 ACCAGcrcAC oaccgtctag aaggcttacc tacaaagoat Acnrcrccr cgccaactct cctccagggg acagcgfaac corrAarATA gcgactacca

9001 ACTCAGCAAC GTCATGCACA ATCCCCCGCA AGATAAAACC AAAATTCGTG GGACCCGAAA AATATCACCT ACCTCCCCTT CACGOTAAOA AC AMCCl lli 
9101 CACAGTGTAC GAOCOTCTGA AAQAAACAAC CCCCCGCrAC ATCACTATGC ACAGCCCGGG ACCGCACCCC TATACCT CC T ATCTCGACGA ATCATCAOGG 
9201 AAAGTCTACC CGAAGCCACC ATCCCGAAAC AACATTACCT ACGAGTCCAA GTCCCGCGAT TACAAGACCG GTACCGTTAC CACCCCTACC CAAATCACGG 
9301 CCTCCACCCC CATCAAGCAC TGCCrrCCCCT ATAAGAGCGA CCAAACGAAG TGCCTCTTCA ATTCCCCGGA CITCATCAGA CATCCCCACC ACACOGCCCA 
9401 ACGGAAATTG CATTTACCTT TCAAGCTCAT CCCGAGTACC TCCATCCTCC CTCTTCCCCA CCCCCCGAAC GTAGTACACO GCTTTAAACA CATCACCCTC 
9501 CAATTAGACA CAGACCACCT OACATTGCTC ACCACCAGGA GACFAGGCOC AAATCCCGAA CCAACTACTC AATGGATCAT CCGAAAGACQ OrrACAAACT
9601 TCACCCTCCA CCCAGATCGC CTGCAATACA TATCGCGCAA TCACCAACCC CTAACCGTCT ATCCCCAACA GTCTGCACCA CGAGACCCTC ACGGATCGCC
9701 ACACCAAATA GTACAGCATT ACTACCATCC CCATCCTCTC TACACCATCT TACCCGFCGC ATCACCTCCT CTCGCGATGA TCATTOCCGT AACTOTTCCA 
9801 CCATTATCTC CCrGTAAAGC CCC C C G T CA G TCCCTCACCC CATATGCCCT GCCCCCAAAT OCCCTCATTC CAACTTCGCr GGCA LIIIIU TGCTCTGITA
9901 GGTCGGCTAA TGCTCAAACA TTCACCCAGA CCATCACTTA CCTATCGTCG AACAGCCACC CATTCTTCTG CGTCCACCTC TGTATACCCC TCC CCOC rOT 
10001 CATCCTTCTA ATGCCCTCTT CCTCATCCTC CCTCCLl 1 1 1 TTACTCCTTC CCCCCCCCTA CCTCGCGAAG CTACACCCCT ACCAACATCC CACCACTCTT
10101 CCAAATCTCC CACAGATACC GTATAAGGCA CITCTTOAAA GGGCACGCTA CGCCCCCCTC AATTFCGAGA TTACTOrCAT CrCCTCCGAG GlllibCCll
10201 CCACCAACCA ACAOTACATC ACCTCCAAAT TCACCACTCT GOT C C CCT CC CCTAAAGTCA AATCCTCCGC CTCCTTCGAA TCTCAGCCCG CCCCTCACOC 
10301 AGACTaTACC TGCAAGCTCT TTCOAGGGCr CTAC CCCl IC AlOtGGC CAC CAGCACAATC TTTTrCCCAC ACTOAGAACA GCCACATCAC TCACGCOTAC 
10401 CTCCAATTCT CACCAGATTG CCCGACTGAC CACCCCCAGG CGATTAAGCT GCATACTCCC CCCATGAAAC TAGGACTACC TATAGTGTAC GGGAACACTA
10501 CCAOl MCCT ACATGTCTAC GTCAACCGAC TCACACCAGG AACGTCTAAA GACCTGAAAG TCATAGCTGG ACCAATTTCA GCATCCTTTA CACCATTCCA
10601 TCACAACCTC CTTATCCATC CCCCCCTCCT CTACAACTAT GACrrCCCCG AATACCGAGC CATGAAACCA CGAC COil iO GAGACATTCA AGCTACCTCC 
10701 TTCACTAGCA AAGATCTCAT CGCCAGCACA GACATTAGAC TACFCAACCC TTCCGCCAAG AACCTCCATO TCCCOTACAC GCAGGCCGCA TCTOGATTCG 
10801 AGATCTCCAA AAACAACTCA GCCCCCCCAC TGCACGAAAC CCCCCCTTTC GCCTCCAACA TTCCACTCAA TCCCCTTCCA GCCCTGCACT GCTCATACGG
10901 GAACATTCCC ATCTCTATCG ACATCCCGAA CGCTGCCnT ATCAGGACAT CACATGCACC ACT G CT CICA ACACTCAAAT CTGATCTCAG TGAGTCCACT 
11001 TACTCAGCGG ACrrCCCCGO GATGGCTACC CTGCACTATG TATCCGACCC CGAAGGACAA TCC C CTCT A C ATTCGCATTC GACCACAGCA ACCCrCCAAG 
11101 AGTCCACACT TC A T G T CC T G GAGaaAGGaG CGCTOACACT ACACTTCAGC ACCCCGAGCC CACaGGCCAA CnTATTCTA T CG CT G TCTO GTAAGAAOAC
11201 AACATCCAAT CCACAATCCA AACCACCACC TGACC^TATC GTCACCACCC CCCACAAAAA TCACCAAGAA TTCCAACCCO CCATCTCAAA AACTTCATGG 
11301 AGTTGGCrCT TTCCCLIl II CGGCCGCCCC TCGTCGCTAT TAATTATAGG ACTTATGATT 1 1 ILL II O CA GCATCATCCT GACTACCACA CGAAGATGAC
11401 CCCTACCCCC CAATCACOCC ACCAGCAAAA CTCCATCTAC TTCCCAGCAA CTGATCfCCA TAATGCATCA GGCTCGTATA TTAGATCCCC GCTTACCCCG 
11501 CCCAATATAC CAACACCAAA ACTCCACCTA TTrCCGAGGA AGCCCACTGC ATAATCCTCC GCAGTGTTGC CAAATAATCA CTATATTAAC CATrTATTTA 
11601 CCGCACCCCA AAACTCAATC TATITCTCAC GAAGCATCCT CCATAATGCC ATCCACCCTC TCCATAACTT TTTATTATTT LH IIA TTAA TCAACAAAAT
11701 TTTCrmTAACATITN
Amino Acid Sequence of the Nonstructural Polyprotein



1 MEKPWNVDV DPQSPPWQL QKSF PQFEW AQQV TTWDHA NARAFSHIAS KUELEVFTT ATILDICSAP ARRMFSEHQY HCVCPMRSPE DPMMMXYAS

101 KLAEKACXrr NKNLKEXOCD LRTVLDTPDA ETFSLCFHND VTWTRAEYS VMQDVYINAP CTIYKQAMKG VRTLYWICFD TTQFMFSAMA CSYPAYimiW 

201 ADEKVLEARN tCLCSTXLSE CRTGXLStMR KKEUCPGSRV YFSVCCTLYP EHRASLQSWH LPSVFHUCCX QSYTCRCDTV VSCECYWKK I T lSKii T Ufc

301 TVCYAVTNNS ECFLLCKVTD TVKGERVSF? VCTTlfATIC DQMTCIMATO BPDDAQKLL VGLNQRrVIN GKTNRNrKTM QKYLLPUAQ GFSXWAXERX

401 EOLDNEKMUS TRERKLTYCC LWAFRTKKVH SFYRPK7TXJT IV KVPA SFSA FPMSSVWTTS LPMSLRQKOC LALQPKXEEK LLQVPEELVM EAKAAFEDAQ 

501 EESRAEKLRE ALPPLVADKG lEAAAEWCE VEGLQAOICA ALVET7RGHV RUPQANDRM IGQYIWSPT SVtXKAKLAP AHPUOJQVKI mtSGRSCRY

601 AVEPYDAKVL MPAGSAVPWP EFLALSESAT LVYNEREPVN RXLYHIAMHQ PAXNTCEEQY KVTKAELAET EYVFDVDKKR CVXKEEASCL VLSOTLITIFP 

701 YHELALECLK TRPWPYKVB TtGVIGAFGS CKSAIOCJrV TAROLVROC XENCREIQAD VLRLRGMQIT SKTVOSVMLN GCRXAVEVLY VD6AFACHAG 

801 ALLAUAIVR PRHKWLCCD PKQCGFFNMM QUCVYFNHPE KDICTinTYX HSRRCTQPV TAIVSXUiYD GKMKTTNPaC KNIEDfrCA TXFKFaOlIL

901 TCFRGWVKQL QIDYPCHEVM TAAASQGLTR KCVYAVRQKV NENPLYATTS EHVNVLLTRT EORLVWICTLQ GDPWIKQLTN VPKGNFQATl EDWEAEKKCI 

1001 lAAINSPAPR TNPPSCICTNV CWAXRLEPIL ATAGIVLTCC QW5ELFPQFA DOXPHSAIYA LOYICIKFFG MDLTSCLFSK QSIPLTYHPA DSARPVAHWD 

1101 NSPGTRXYGY DHAVAAELSR RFPVFQLACK GTOLOLCTTGR TRVISAQKNL VPVNRNLPHA tVPEKKEXQP GPVXKFLSQF KKKSVLWSB EKIEAFHKRI

1201 EWUPIGIAO ADKNYNLAFG FPFQARYDLV FtNICTKYRN HHFQQCEDHA ATUCTLSRSA LNO-NPCGTL WKSYGYAOR NSEDWTALA RKFVRVSAAR 

1301 PECVSSNTEM YLIFRQLONS RTRQFTPKHL NCVISSVYEC TRDGVGAAPS YRTKRENIAD CQEEAWNAA NPLGRPCEGV CRAIYXRWPN SFTDSATETC 

1401 TAXLTVCQGK KVIHAVGPDF RKHPEAEAUC LLQNAYHAVA OLVNEHNOCS VAIPLLSTCI YAAGKDRLBV SLNCLTTALO RTDAOVTIYC LCKKWKERID 

1501 AVLQUCESVI EUCOEDMEIO DELVWIHPDS CtXGRKCFST TXCKLYSYFB CTXFHQAAKO MAEIXVLFPN DQESNEQLCA YILGCTMEAI REKCPVDHNP

1601 SSSPPXTIJC LCMYAMTPER VH RLRSN WVK EVTVCSSTPL FKYKIK NVQK VQCnCWLFN PHTPAFVPAR XYIEAPEQPA APPAQAEEAP EVAATFTPPA 

1701 AOKTSLOVTD ISLOMEDSSB CSLFSSFSCS DNSIT SMDS W SSGF5SLEIV DRRQVWAOV HAVQEPAPVp PPRLXKMARL AAARMQEEPT PPASTSSAOE 

1801 SUILSFCGVS MSFCSLFDGB MGALAAAQPP ASTCFTDVPM SFCSFSDGEI EEL5RRVTCS EPVLFCSFEP CEVNSIBSR SWSFPPRXQ RRRRRSRRTE

1901 Y 



Amino Acid Sequence of the Structural Polyprotein 



1 MNRCFFNMLG RRPFPAPTAM WRPRRRRQAA PMPARNCLAS QIQQLTrAVS ALVICQATRP QTPRPRPPPR QKKQAPKQPP KPXXPICTqEK XXKQPAXPKP 

101 GKRORMAUCL EADRLFDVKN EDCOVIGHAL AMECKVMKPL HVXGTIDHPV LSKLXFTXSS AYDMEFAQLP VNMR5EAFTY TSEHPECFYN WHHGAVQY5G 

201 CRFTIPRGVG GRGOSGRPIM DKSCRWAIV LCGAOEGTRT ALSWTWNSK GKTnCTTPEO TEEWSAAPLV TAMCLLCrfVS FPCNRPPTCY TREPSRALOt 

301 LEENVNHEAY I7TLLNA1LRC GSSGRSKR5V TDDFTtTSPY LCrCSYCMHT EPCFSPIKIE QVWOEAOONT IRIQfTSAQFO YDQSGAASSN KYRYMSLEQD 

401 KTVKEGTMDD DCOTSGPCR RLSYKGYFLL AKCPFCDSVT VSIAS5NSAT SCTMARXOCP KFVGREKYDL PPVHGXKtPC TVYORUCCTT AGYHMHRPG 

501 PKAYTSYLEE SSGXVYAKPP SCKNTTfECK CCD YKTO IVT TRTETTCCTA OCQCVAYKSD QTXWVFNSPD URHAOHTAQ CKUILPFKU PSTCMVPVAK 

601 APNWHGF1CH ISLQUTTDHL TLLTTRRLCA NPEPTTEWn GKTVRNFTVD ROGLEYIWCN HEPVRVYAQB SAPCCPHGWP HEIVQHYYHR HPVYTOAVA 

701 SAAVAMMICV TVAALCACKA RRECLTPYAL APNAVHTSL ALLCCVRSAN AETFTETMSY LWSNSQPFFW VQLCIPLAAV IVLMRCCSCC LPFLWAGAY 

801 lAKVSAYEHA TTVPMVFCIP YKALVERACY APUfLEOVM SSEVLPCTNQ EYrTOCFTTV VPSPKVKCCC SLECQPAAHA DYTOCVFGGV YPFMWGOAQC 

901 FCOSENSQMS EAYVELSADC ATDHAQAIKV KTAAMKYGLR IVYCKTTSFL DVYVNGVTPG TSXOUCVIAC PISASFTPFD HKWIHRCLV YNYOFPSYQA 

1001 MKPGAFGDtQ ATSLTSKDU A^rRLLKP SAKNYHVPTT QAASGFEMWK NNSGRPLQET APFGCKIAVN PLRAVOCSYG NIPtSIOfPN AAFUmOAP

1101 LVSTVKCDVS ECTYSAOFCO MATLQYVSDR EGQCPVHSHS STATLQESTV HVLEICGAVTV HPSTASPQAN FIVSLCCKICT TCNAEdCPPA OHIVSTPHKN 

1201 OqEFQAAXSK TSWSWLFAUF GGASSLUIG LMIFACSMML TSTRR 
Nucleotide Sequence of S55

I ATTCOCGCCC TACTTACACAC TATTCAATCA AaCACCCCAC CAATTOCaCT ACCATCACAA TCCACAAOCC ACTaCTTAAC CTAGACCTaG ACCCICACAC TCCCTrWlC CTQCAACRSC 
121 AAAAGAGCTT CCCCCAATTT CACCTACTAG CACAC C ACCT CACTCCAAAT GACCATDCTA ATOCCACACC ATTTTCOCAT CTCGCCACTA AACTOATCCA GCTOCAGCXT CCTACCACAG 

241 ccACGATTTr ccacataccc agcccacccc ctcctagaat crrrrccCAC caccactacc ATTGCCTTTC CCCCATCCGT ACICCACAAC acccgcaccg CATCATCAAA TATCCCACCA 

36t AACrOGCCSGA AAAACCATOT AAGATTACAA ACAaCAACIT CCATCaGAAG ATCAACGACC TCCGGACCCT ACTTCATACA CCCOATGCTC AAaCGCCATC ACIUTCCTTC CACAACCATC 
ai TTACCroCAA CACGCCTCCC CACTACTCCC TCATCCACGA CCTCTACATC AACGCTCCCC CAACTATTTA CCACCACCCT ATGAAACGCC TCCGGaCCCT GTACTCCATT GCCTTCGACA 
«t CCACCCACTT CATCTTCTCC CCTATGCCAC CTTCCTACCC TCCATACAAC ACCAACTOCG CCCACGAAAA ACTCCTTCAA CCCCCTAACA TCOCACTCn; CACCACAAAG ctoactqaag 
711 GCAGGACACG AAACTICTCG ATAATCACCA ACAACCACTT CAACCCCCCG TCACCCCTTT ATTTCPCCCT TCGATCCACA CTTTACCCaO AACACAGACC CACCTTCCAG ACCTOGCATC 
Ut TTCCATCGCT CTTCCACrrC AAACCAAACC ACTCCTACAC TTCCCCCICT GATACACTCC TCACCTCCGA ACCCTACGTA CTCAACAAAA TCACCATCAC TCCCOOCA-rc ACCCCAGAAA 
961 CCCTCGGATA CGCOCTTTACA AACAATACCC ACCOCULII CCTATCCAAA CtTACCGATA CACTAAAAGG ACAACCGGTA TCGTrCCCCO TCTCCACGTA TATCCCGCCC ACCATAIOCC 
101 1 ATCAGATCAC CCGCATAATC CCCACCGaTA TCTCACCTCA CCATCCACAA AAACrrCTCC TTCCCCTCAA CCAGCCAATC CTCATTAACC GTAAGACTAA CACCAACACC AATACCAICC 
1201 AAAATTACCT TCTCCCAATC ATTCCACAAG CCTTCACCAA ATCGOCCAAG GAG C OCAAAO AACATCTTCA CAATCAAAAA ATCCTGCCCA CCACACAGCG CAAOCTTACA TATOGCTOCT 
tJ2t TCTOGCCCTT TCOCACTAAC AAACTCCACT CCTTCTATCC CCCACCXCGA ACCCACACCA TCtTTAAAACT CCCaGCCTCT TTTAGCGCTT TCCCCATCIC ATCCCTATOa ACTACCTCTT 
[441 TGCCCATCTC CCTQACGCAO AAGATCAAAT TSCCATTACA ACCAAACAAC CAOC A AAAA C TCCrCCAACr CCCCCACCAA ITACTTATCG ACGCXAACGC TSCTTICQAO GATOCTCACa 
IMI AGGAATCCAG ACCGCAOAA C CTCCCaCAAQ CACTCCCACC ATTACTCGCA GACAAACGTA TCGAGOCAGC tocggaactt OTCTOCGAAG TGOACGGGCT CCAGGCGCAC ^rnWAQgAO 
IMt CAfrrcCTCGA AACCCCGCCC OCTCATCTAA OGATAATACC TCA A CCAAAT CaCCCTATBA TCCOACACTA TATCCTTCTC TCGfcCCATCT CTBTUCTOAA GAACOCTAAA CTCOCACCAC 
tni CACACCCCCT AGCAGACCAG CTTAAGATCA TAACGCACTC CCCAACATCA GGAACCTATC CACTCGAACC ATACGACCCT AAACTACTCA TGCCAGCAGG AACTOCCGTA CCATOCCCAG 
19X1 AATTCTTACC ACTQACrCAG AC C GCCA COC TICTGTACAA CCAAAGAGAG UTOTGAACC CCAAGCTCTA CCATATTCCC ATCCACGCTC CCGCTAAGAA TACAGAAGAG GAOCACTACA 
204t AGCTTACAAA OGCAOACCTC GCACA A ACAG ACTACCTCTT TSACCTQCAC AAGAAGC6AT GCCTTAAGAA CGAAGAACCC TCaOGACTTO TCCTTTCGCG AGAACTCACC AACCCCCCCT 
2Ut ATCACGAACT ACL 1 LI lUAC CCACrSAACA CTCCACCCCC GCTCCCCTAC AACCTTGAAA CAATAGGACT CATACGCACA CCACCATCCG CCAAGTCAGC TATCATCAAO TCAACTCTCA 

2211 ccGCACCTCA TCTTCTTACC accgcaaaga aacaaaactc ccccgaaatt gacccccacg toctacccct cacccccatx; CACATCACCT CCaaCaCACT OCATTCCCIT ATCCTCAACG 
2401 CATGCCACAA ACCCGTACAA ctcctcttatc ttcaccaaoc cttcccctcc cacccagcaC cactacttcc cttgattcca atcctcacac cccctaagaa gctactacta tccggacacc 

ISir CTAACCAATC CGCAliLlIL AACaTCATGC AACTAAACCT ACATTTCAAC CACCCTCAAA AACACATATC TACCAACACA TTCTACAACT TTATCTCCCC ACCTTCCACA cacccactca 
2«41 CGGCTATTCT ATCOACACTG CATTACCATG GAAAAATGaA AAC^aCAAAC ^CTCCAAGA aCAACATCCA AATCCACATT acagcgccca CCAACCCGAA cccacgggac atcatcctsa 

2761 catctttccc CCCCTCCOTT AAGCAACTOC aaatccacta tcccggacat cacctaatga caoccccocc ctcacaaggc ctaaccacaa aaggactata tcccctccoc caaaaactca 

2ttt ATCAAAACCC CCTCTACOCG ATCACATCAG AGCATCTCAA CCTCTTCCTC ACCCGCACTC AOGACAGGCT ACTaTGGAAA ACnTACAOC OCGACCCATC CATTAAGCAG CICACTAACG 
3001 TACCTAAAGG AAATTTTCAC CCCACCATCG AGGACTOGCA ACCTQAACAC AAGGCAATAA TTBCTCCCAT AAACACTCCC GCTCCCCCTA CCAATCCCTT CaGCTOCAAG ACTAACGTIT 
3131 OCTGGCCGAA ACCACTC6AA CCGATACTGQ CCACGCCCOG TATCOTACTT ACCGCTTCCC AGTGGAGCGA GCTCTTCCCA CAUI1IU.CQ ATSACAAACC ACACT CCOCC ATCTACGCCT 
3241 TAGACCTAAT TrCCATTAAG 1111 ILCGCA -TOCACTTCAC AACCOCGCTC TXTTCCAAAC ACACCATCCC CTTAACCTAC CATCCTCCCC ACTCAGCCAG GCCACTAGCT CATTSGGACA 
1361 ACaCCCCAGG aacacocaac tatogotacg atcaccccct TC CCCCC CAA CTCTCCCCTA GATTTCCOCT CTTCCAGCTA GCTCGGAAAG CCACACAGCT tc auill ag accgocagaa 
sat CrACACITAT CTCTCCACAG CATAACTTCG TCCCACTGAA ccgcaatcic cctcacgccttaciccccca ocacaagoag AAACAACCCC GCCCOGTCCA AAAATTCTTO AGOCAGTICA 
3«l AACACCACTC CCTALIILIU ATCTCAGACA AAAAAATTCA AGCICCCCAC AAGAGAATCG AATCGATCGC CCCGATTCOC ATAGCCCGCC CAGATAAGAA CTACAACCTO OCTTTCOOCT 
37X1 TICCOCCGCA GCCACGCTAC GaCCTOCTCT TCATCAATAT TCGAACTAAA TaCAGAAACC ATCaCTTTCA ACACTQCCAA GACCACGCCO CCACCTTCAA AaCCCTTXCO ccttcocccc 
3641 TSAACTGCCT TAACCCCCCA CCCACCCTCC TCCTCAACTC CTACCCTTAC GCCCACCCCA ATACTCAOCA CCTACTCACC GCTCTTCCCA CAAAATTrCT CACACTGTUr CCACCCACOC 
3961 CACACTGCCr CTCAAGCAAT ACaCaAATCT ACCTCATTTT CCCACAACTA CaCAACAGCC OCACACGaCA aTTCACCCCC catcatttca attctctcat ttcctccctg taccagccta 
4061 CAACAGACCC ACTTCCACCC OCaCCCTCCT ACCCTACTAA AACCCaCAAC ATTCCTCATT CTCaaGACCA ACCACTTCTC AATCCAGCCA atccactogg cacaccacca GAACCAcrrcr 
4101 CCCCTCCCAT CTATAAACCT TCCCCCAACA CTTTCACCCA TTCACCCACA GAGaCaCCTA CCCCAAAACT CaCTCTUTCC CAACCAAACA AACTCATCCA CCCOCTTCCC CCTCATTTCC 
•nil CCAAACACCC ACACCCACAA CCCCTCAAAT TCCrOCAAAA cccctaccat CCACTCOCAG ACTTACTAAA TCAACATAAT ATCAACT C T C TCGCCATCCC ACTOCTATCT acacgcatit 
*44l ACOCAGCCCC AAAACACCCC cttcagctat cacttaactg cttcacaacc gcoctagaca GAACTCATCC CCACCTAACC ATCTACTCCC tcgataacaa ctocaa c caa acaatcgacg 
4361 CCCTGCILCA ACTTAAGGaG tctctaacts acctgaagga tgaggatato gagatcgacg acgacttact atggatccat cccgacagtt gccttmacgc aacaaaggca ttcactacta 

^1 CAAAAGCAAA CTTUTATTCC TACRTCAAG CCACCAAATT CCATCAACCA CCAAAACATA TCGCGGAGAT AAAGGTCCTC TTCCCAAATG ACCACCAAAG CAACCAACA A ClGiUlLLLI 
4101 ACATATTCGG CCAGACCATC CAACCAA T CC CCGAAAAATG CCCOCTCGAC CACAACCCCT CCTCTACCCC GCCAAAAACC CTCCCCTCCC TCTCTATCTA TCCCATCACG CCAGAAAGGG 
«921 TCCACACACT CACAAOCAAT AACCTCAAAC AACTTACACT A I OL I LL ILL ACCCCCCTTC CAAACTACAA AATCAAGAAT CTTCACAACC TTCACIGCAC AAAACTAGTC CTCTTTAACC 
9041 cccatacccc CGCATTCCTT CCCGCCCCTA ACTACATAGa agcaccagaa CA GC CT C CAG CTCCCCCTGC ACAGOCCGAG gacccccccg CaCTTCTACC GACACCAACA ccacctccag 
SKI CTCATaaCAC CTCCCTTCAT GTCACCGACA TCTCACTCGA CATCCAACAC 'aCTAGCCAAC GCTCALILir TTCGAGCTTT AGCGGATCOC ACAACTACCC AAGCCAGGTO CTCCTCOCTC 
32B1 ACCTCCATCC CCTCCAACAC CC I GCLC C m TTCCACCCCC AAGCCTAAAG AACATCGCCC GCCTCGCaGC GGCAAGAATG CAGCAAGaGC CAACTCCACC GO C AAGCACC AOCTCTCCGC 
S401 ACGAGTCCCT TCACLIIILI TTTGATGGG6 TATCTATaTC CTTCCCATCC LllllLGACG CaCaGATCCC CCOCTTCGCA GCCCCaCAAC CCCCGCCAAG TACATGCCCT ACCCATCTOC 
Snt CTATCTCnr CCCATCCnr TCCCACGGaG AGATTCAGCA GnGACCCCC ACACTAACCC ACTCCGACCC CUlLLlLlll GCCTCATITO AACCGOGCGA aCTGAACTCA ATTATATCCT 
J64I CCCGATCAGC CCTATCTTTT CCACCACCCA ACCaGAGaCC TaGACGCaGG ACCACGaGCA CCCAATACTC TCTAACCCCC CTACCTCCCT ACaTATTTTC CaCCGACACA GGCCCTCGGC 
5761 ACTTCCAAAA GAACTCCCTT CTCCACAaCC ACCTTACACA aCCCaCCTTC CaCCCCAaTC IT C T CC aaaC aATCTACCCC CCCCTCCTCC acacctccaa acaccaacac ctcaaactca 
mi CCTACCACAT CAT6CCCACC caa ccc aaca aaaccaocta ccagtctcca aaactacaaa accacaaacc cataaccact cagccactcc TTTCaCCCCT acccctctat AACTCTOOCA 

6001 CAGaTCACCC ACAATCCTAT AACaTCaCCT ACCCCAAACC ATCCTATTCC ACCAGTCTAC CACCCAACTA CTCTCaCCCA AACTTTCCTC TAC L.LIHL TAACAACTAT ctccatgaga 
4121 ATTACCCCAC OCTACCATCT TATCACATCA CCCACCACTA CCATOCTTAC TTCCATATOC TaCaCCCCAC ACTCCCTTCC CTACATACTC CAA LIUHL CCCCCCCAAG CTTACAACTT 
6241 ACCCCAAAAG aCACCACTAT AGACCCCCAA ACATCCGCAC TCCGCTTCCA TCAGCGATCC ACAACACCIT CCAAAACCrC CTCATTCCCG CCaCTAAAAG aaactccaac ctcacacaaa 
6361 TCCCTCAACT CCCAACACTC CACTCACCGA CATTCAACCT TCAATCCTTT CCAAAATATC CATCCAATCA CCACTATTCG CACCACTTTC CCCCAAACCC AATTACCATC ACTACTCACT 
6ai TCCTTACCCC ATaCGRSCCC AGACTCAAAC GCCCTAAGGC CGCCCCACTC TTCCCAAAGA COCATAATTT CCTCCCATTS CAAGAACTCC CTATCGATAC AWLCILA TC CACATCAAAA 
6601 GAGACGTCAA ACTTACACCT GGCA C GAAAC ACACAGAAGA AACACCCAAA CTACAACTCA TACAACCCCC ACAACCCCTC GCCACCOCrr ACCTATCCGC GATCCACCGG CACTTaCTCC 



FIG. 5A
6711 CCACCCTTAC ACCdCTTTTTC CTACCCAACA TTCACACCCT CTTTCaCaTO TCCCCCCaCC ACTTTCaTCC AATCATACCA CAACACnTA ACCAAGCTCA CCCCCTACTC GAGACGQATA
6841 TCGCCTCCTT CCACAAAAOC CAACaCCACG CTATCCCCTT AACCCCCCTC ATCATCTTCC AaCaCCTCCC TCTCCaCCAA CCACTACTCG ACTTCATCGA CTOCCCCTTT GGAGAAATAT 
6961 CATCCACCCA TCTCCCCACC GCTACCCCTT TCAAATTCCC CCCCATCATC AAATCCCCaa TCTTCCTCAC C Ll(-iHL.tL AACACAGTTC TCAATCTCCT TATCGCCACC AGACTArrCQ 
TMI ACGACCCGCr TAAAACCTCC AAATCTDCaO CATTTATCCG CC A CCACAAC ATTATACACC CACTACTATC TCaCAAACAA atccctgaca gcttctgccac creccTCAAC atcgacctta 
7MI ACATCATTCA CCCACICATC CCCGACACAC CACCTTACTT CTCCGCTSGA TTCATCTTCC AACATICCCr TACCTCCACA CCCTCrCCCG TCOCGGACCC CTRSAAAACG CTUTmACr 
7311 TCCCTAAACC GCTCCCACCC CaCCaTCACC AACACCAAGA CAGAAGACCC OCT tl OCT AC ATCAAACAaa GCCGTCCnT AGAOTAGOTA TAACAGaCAC CTTAOCACTO CCCCTOOCAA 
7441 CTCCCrATGA CCTACACAAC ATCACACCTC TCCTCCTCCC ATTCAGAACT TTTOCCCACA CCAAAACAGC ATTTCAACCC ATCACAOCOO AAATAAACCA I L I LlA CCCr CXJTtrCTAAAT 
7361 ACPCAGCATA CTACATTTCA TCTOACTAAT ACCaCAACAC CACCACCATG AATACACGAT TCTTTAACAT CCTCGGCCCC CCCCCCITCC CAGCCCCCAC TCCCATCTCO ACGCCCCGGA 
76It GAACCACGCA OOCCCCCCCC ATCCCTCCCC CCAATCCOCT CCCTTCCCAA ATCCAGCAAC TCACCACaCC CCTCACTGCC CTACTCATTC CACACCCAAC TACACCTCAA ACCCCACCCC 
7101 CACCCCCCCC CCCCCCCCAC AAGAACCACC CCCCAAACCA ACCACCCAAC CCCAAGAAAC CAAAAACACA GCAGAACAaC AACAAGCAAC CTCCAAAACC CAAACCCGCA aagagacagc 
7721 CTATCOCACT TAACTTCCAC CCCCACAGAC TCTTCCACCT CAAAAATCAC CACCGACaTC TCATCCCCCA CGCACTOCCC ATCCAACCAA ACGTAATCAA ACCACTCCAC CTCAAAGGAA 
1041 CTATTCACCA CCCTCTCCTA TCAAACCTCA AATTCACCAA CTCCTCAGCA TACCACATCC ACTTCCCACA CTTCCCCCTC AACATSAGAA CTtSACCCCrr CACCTACACC ACTCAACACC 
1161 CTCAAGCCrr CTACAACTCG CACCACCCAC CCCTCCACTA TACTCGACCC ACATTTACCA TCCCCCCCGG ACrACOACCC ACACOAGACA CTGGTCCrrCC GATTATBGAT AACICAGCCC 
gut GCCTTOTCCC CATAGTCC1C CCACCCGCTC ATCACCCAaC AAGAACCCCC Li I ILGCTCC TCACCRSCAA TACCAAAGCG AAGACAATCA ACACAACCCC CCAAGCCACA GAAGACTCCT 
•401 CTGCroCACC ACrCGTCACC GCCATCTOCT TSC1TCGAAA CGTGAGCTTC CCATCCAATC CCCCCCCCAC ATCCTACACC CGCGAACCAT CCAGAGCTCT CCACATCCTC CAACACAACO 
tS21 TCAACCACCA CGCCTACGAC ACCCTSCTCA aCGCCATATT CCOCreCGGA TCCreCCOCA GAACTAAAAC AACCOTCACT GACGACTTTA CCrrCACCAG CCCGTACntl CCCACATCCr 
1641 CCTACTUrCA CCATACTCAA CCCTCCTTTA GCCCCATTAA CATCCAGCAC CTCTSCCATC A A CCCCACGA CAACaCCATA CGCATACACA CTTCCCCCCA CTTrCCATAC GACCAAACCC 
not C AGCAGCAAC CXCAAATAAC TACCCCTACA TCTCOCTCGA GCAGGATCAT ACTOXCAAAO AAGGCACCAT CGATCACATC AAGATCAGCA CCTCAGGACC CTOrAGAAGO CTTAOCTACA 
list AACGATACrr TCTCCTCCCG AAGTCTCCIC CAGGCGACAG CGTAACOGIT AGCATACCGA CTACCAACTC AGCAACCTCA TCCACAATCG CCCGCAACAT AAAACCAAAA rrCGXGCCAC 
9001 C GCAAAAA T A TSACCTACCT CCCCTTCACO OTAAGAAGAT TCCITOCACA CnSTACGACC CTCTCAAAGA AAGAACCCCC GOCTACATCA CTATCSCACAG GCCGGGACCC CACCCCTATA 
9IZI CATCCTATCr CCAGGAATCA TCAGCCAAAC TTTACCCGAA GCCACCATCC CGGAAGAACA TTACCrACCA GTCCAACTCC GCCCATTACA ACACCCGAAC CCITACCACC CCTACCGAAA 
9Z4I TCACCCCCTC CACCCCCATC AACCACTCCS TCOCCTATAA CACCGACCAA ACCAACTCCC tcttcaactc cccccactcc atcacacacg cccaccacac cccccaacgg aaattccatt 

9361 TCCCTTTCAA CCTCATCCCC ACTACCTCCA TCCTCCCTCT TGCCCACCCC CCGAACCTAG TACACOCCTT TAAACACATC ACCCTCCAAT TACaCaCACA CCATCTCaCA TTOCTCACCA 
9««rCCACCACACr ACCCCCAAAC ccccaaccaa ccactcaatc gatcatccca AACACCCTTA CAAACTTCAC cctccaccca catcccctcg AATACATATC CCCCAATCAC CAACCAGTAA 
9601 CCCrCTATCC CCAACACTTCT CCACCACCaC ACCCTCACCC ATCCCCACaC GAAATAGTAC AGCATTACTA TCATCCCCAT CCTCTCTACA CCATUTTAGC CCTCCCATICA CCTCCnTTCC 
9721 CCATGATCAT TCCCtTTAACT CTTCCACCAT TATCTCCCTC TAAACCCCCC CCTCACTGCC TCACCCCATA TCCCCTCCCC CCAAATCCCC TCATICCAAC TTCCCTCCCA CTTTTtrrOCr 
9841 GTCrrACCTC CGCTAATCCT CAAACATTCA CCGaGACCAT GAGTTACTTA TCCXCGAACA CCCaGCCCTT CTTCTGGCrC CAGCrCTUTA TACCTCTCGC CCCTGTCCTC CnCTAATCC 
9MI CCTCTTOCTC AT5CTCCCTG CCITTnTAG TCbllULLGC CGCCTACCTC GCCAA O CfTAG ACCCCTACGA ACATOCGACC ACmTCCAA Al CT CCCA CA GATACCCTAT AAGGCACnC 
lOOil TTCAAACCCC ACGGTACCCC CCCCTCAATT TSGAGATTAC TCTCATCTTCC TCGGAGGTTT TSCCTTCCAC CAACCAA C aO TACATTACCT GCAAATTCAC CACTCTCGIC C CC T CCCCIA 
10201 AAGTCACATC CTCCGGCTCC TTOGAATCTC AGCCCGCCCC TCA C GCA G A C TATACCTCCA AGCTCTTTGG AOGGGTUTAC CCCTTCATBT CCGGACGAPC ACAATCTTTT TCCGACACTC 
103X1 ACAACACCCA GATCACTGAC CCCTACGTCG AATTCTCACT AGATTOCCCC ACICMCCACC CCCAGGCCAT TAAGCICCAT AC ICCCCC CA TCAAACTAOO ACTGCCTATA CTCrACGCCA 
10*41 ACACTACCAC TTTCCTACAT CTCrACCTGA ACGGACTCAC A C CAGGAACO TCTAAAGACC TCAAAGICaT AGCraOACCA ATTTCAGCAT lUlllACACC ATTCCATCAC AAOilLUIlA 
10561 TCA A TCG CCO CCTGCTCTAC AACTATGACT TTCCCGAATA C GCACCCATO AAA C CAGGaC CCTTTCGAGA CATTCAAGCT ACCTCCTTCA CTAGCAAACA CCTCATCCCC A C CACACACA 
10681 TTACCCTACT CAAGCCTTCC CCCAAGAACG TCCATCTCCC CTACACGCAG GCCCCATCTC CATTCCAGAT CrOGAAAAAC AACrCAGCCC CCCCACTCCA GGAAA C CCCe CCTTTTQOCr 
lOm GCAAGATTGC ACTCAATCCG CliCtA OCGG TCCACTCCTC ATACOCGAAC ATTCCCATTT CTATTUACAT CCCGAACCCT CCCZTrATCA GGACATCACA TCCACCACTQ GICTCAACAC 
10911 TCAAATCTCA TCTCACTCAC TGCAdTATT CAGCGGACTT CGGAGGGATG GCTACCCTCC AGTATUTATC CGACCGCGAA GGACAATCCC CTCTACATTC OCATTCGACC AC A CCAA C CC 
1 1041 TCCAAGACrC CACACTTCAT CTCCTCCaCA AACCACCCCr GACACTACAC TTCACCACCC CCaCCCCACA CGCCAACTTC ATTCTATCCC TCrrCTCGTAA OAAGACAACA TOCAATQCAG 
It 161 AATCCAAACC ACCAGCTCAT CATATCCTCA CCACCCCCCA CAAAAATCAC CAAGAATTCC AACCCCCCAT CTCAAAAACT TCaTCCACTT CGCTtTrrrCC CCTTTTCCGC GGCGCCTCCT 
11221 CCCTATTAAT TaTaCCaCTT ATCATTTTTC CTTCCACCAT CaTCCTCaCT ACCACaCCaa CATCACCCCT ACCCCCCAAT CACCCGACCA gcaaaactcg atctacttcc gacgaactca 
M40t TX7TCCATAAT CCATCACCCT CCTATATTAC ATCCCCCCTT ACCGCCCGCA ATATAGCAAC ACCAAAACTC CACCrAmU CCACCAACCC CACTGCATAA TGCTGCGCAG lUllLCCAAA 
» I JIl TAATCACTAT ATTAaCCATT TATTCACCCG ACGCCAAAAC TCAATGTATT TCTCAGGAAC CATCCTCCAT AATGCCATCC AGCCrcrOCA TAACTTTTTA TTAlllLlU TATTAATCAA 
11641 CAAAATTTTC i HI IA ACAT TTC 
Nucleotide Sequence of TR339

1 ATTOCCCCCC TACTACACAC TATTCAATCA AACACCCCAC CAATTCXrACT ACCATCACAA TCX]JtfUAGCC ACTA^ GTAGACCrAG ACCCCCAOAO ILLbll IQIC CTCCAACTOC 

121 AAAAAAOCrr CCCCCAATTT CACCTACTAC CACACCAGCT CACTCCAAAT caccatocta atocxacacc attttcccat ctgoccacta aactaatcoa octccmggtt cctaccacag 

241 CCACCATCrr CCACATAGGC ACX»CACCC6 CTCGTAOAAT CTITrCCCAC CACOUJTATC ATPOTtrFCTO CCCCATBCCST ACICCACAAC ACCCCOACCC CATOATCAAA TATtJCCACTA 
361 AACTQOCGCA AAAAfiCCTCC AAGATTACAA ACAACAACTT OCATGAGAAG ATTAAGQATC TCCCCACCGT ACITOATACO CCGGATQCTO AAAOCCATC U,lUU:ur CaCAACOATC 
ai TTACCICCAA C ATOCgTGCC CAATATPCCG TCATOCACCA CCTOTATATC AACCCTtXX» CAACTATCTA TCATCACOCT ATOAAAOOCC ■TOCOOACCCT CTACTOOATT OCCTTCOACA 
«M CCACCCAGTT CATCTTTCTCG OCTATBOCAO CTTCCrACCC TCCCTACAAC ACCAACTOCG CCCACOAGAA AGTCCTTGAA OCOCCTAACA TCGCACTTTC CAGCACAAAO CTOACTOAAO 
T2I GTAGCACACO AAAATTOTCG ATAATQAOCA AGAACGAGTT CAAOCeeOOC TCCCOOCgTTT AI I tLIUtUI AGCATCCACA CTTTATCCAC AACACACACC CAOCTTOCAO ACXITGCCATC 
Ml TTCCATCCCr CrTCCACTTQ AATOCAAAOC AOrCOTACAC TICCCCCTOT GATACACrSG TCACTTOCGA ACCCTACCTA CTCAACAAAA TCACCATCAO TC C CCC C ATC ArfTOOAOAAA 
Ml CCGTCOSATA CCCCCmCA CACAATACCG ACCU.I1L11 CCTATCCAAA CTTACTCACA CAGTAAAAGG ACAACGOGTA TCGmXCTO TCTCCACGTA CATCCCOOCC ACCATATOCC 
iOtl ATCACATCAC TOGTATAATO GCCACCCATA TATCACCTCA CCATOCACAA AAACTTCTCG TTCCCCTCAA CCAGCCAATT CTCATTAACC CTAOGACTAA CACCAACACC AACACCATCC 
1301 AAAATTACCT TCTOCCCATC ATACCACAAG CCTTCAGCAA ATCGGCTAAG GAOCCCAAOC ATCATCTTOA TAACGACAAA ATGCTOOOTA CTAGACAACC CAAGCTTACG TATOOCTCCT 
13J1 ■nTTGOGCCTT TCCCACTAAG AAAGTACATT CCTTTTATCO CCCACCTCGA ACCCAGACCA TCGTAAAACT CCCAGCCTCT TTTAOCOCTT TTCCCATOTC GTCCGTATOO ACGAGCICTT 
1441 TCCCCATCTt: GCTGACCCAG AAATTGAAAC TGCCATTOCA ACCAAAGAAO CACGAAAAAC TCCTCCAGCT CTCGGAOCAA TTACTCATCO AOOCCAAOOC TOCTTTTCAC GAIGCICACG 

IJ6I A G GAAcccAG aocgcagaag cfccgagaag cacttccacc attagtcgca gacaaaggca tcgacocaoc cgcagaacit gictgcgaag toqaooogct rTAfJtmue AI 



IMI CATTACnGA AACCCCGCGC GGTCACGTAA GGATAATACC TCAACCAAAT GACCGTATSA TCOCACACrA TATCCTTCfTC TCCCCAAACT CTTmJCTXtAA GAAlOCCAAA C1C0CACCAC 
1101 CCCACCCCCT ACCAGATCAO GrTAAOATCA TAACACACTC CCOrAOATCA GGAAGGTACG COCTCGAACC ATACGACGCT AAACTACTGA TCCCAOCACC AGGXGCCGTA CCATC0CCA6 
1921 AATHrCTACC ACrOAOItMa AOCOCCACCT TACIUTACAA CCAAACA O AG TTTCTQAACC GCAAACTATA CCACATTGCC ATQCATCOCC CCGCCAACAA TACAOAAGAG GAOCAOTACA 
2041 ACGTTACAAA GCCAGAGCTT OCAGAAACAG ACTALUIUII TCACCTOCAC AAGAAGCCTT OCCTTAAGAA GGAAGAAOCC TCAG GIClUi ILLILILUAi AGAACTCACC VLCCfT Ct; C I 
2M1 ATCATOAGCT ACCTCTC6AG GCACTCAAGA CCCGACCTOC 0G1CCCGTAC AAOCICGAAA CAATAGGAGT GATACGCACA CCGGGGTCGG OCAAQTCAOC TATTATCAAG TCAACTCTCA 
at! CGOCACGGGA TOTCTTACC AGCGGAAAGA AAGAAAATTG TCGCQAAATT GAGGCCCACG TCCTAACACr GAGGGGTATO CAGATrACOT CGAAGACAGT AGATTCGGIT ATCCTCAACC 
2401 GATCCCACAA AGCCGTAGAA UIU.IUIA CG TTOACGAAOC Cr TCGC C I O C CACGCAGOAC CACTACTTCC CrrCATTCCT ATCCTCAOCC CCCGCAAOAA GGTAGTACTA TOXGACACC 
ani CCATOCAATG CCGATTCTTC AACATQATCC AACTAAACGT ACATTTCAAT CACCCTGAAA AACACATATC CACCAAGACA TTCTACAACT ATATCTCCCO OCCTTOCACA CAOCCACTTA 
2«4I CAGCTATTGr ATCOACACTO CATTACCATO OAAACATOAA AACCACGAAC CCGnJCAACA ACAACATTCA AATCGaTATT ACAGGGCCCA CAAACCCCAA GCCACCGGAT ATCA TCC T C A 
TMl CATOnrCCG CCCCTOCCTT AACCAATTCC AAATCCACTA TCCCOCACAT CAACTAATCA CACCCGCOGC CTCACAAGOG CTAACCAGAA AAOGACrGTA TOC CCTCCOO CAAAAAGTCA 
2«l ATCAAAACCC ACTUrACCCG ATCACATCAC ACCATCTCAA CCTCTTOCTC ACCCCCACTG ACGACACCCT AGTCTCCaAA ACCTTCCAGC CCGACCCATG GATTAAGCAG CTCACTAACA 
3001 TACCTAAACG AAACnrCAC CCTACTATAO ACGACTOCGA ACCTGAACAC AACCGAATAA TTCCTCCAAT AAACACCCCC ACTCCCCCTO CCAATCCCTT CAGCTCCAAG ACCAACOTTT 
3121 OCrCGCCGAA ACCATTGCAA CCCATACTAC CCACCCCCCG TATCCTACTT ACCCCTTCCC AGTCCAGCGA ACTCTTCCCA CAGTTTCCOG ATGACAAACC ACATTCCGCC ATTTA CGC Cr 
3241 TAGACGTAAT TTGCATTAAC 11 1 11I.0CCA TCCACTTCAC AAGCCCACTG TTTTCTAAAC ACAGCATCCC ACTAACGTAC CATCCOCCCC ATICAGCGAG GCCGGTACCT CATTOOGACA 
33«l ACACCCCAGG AACCCCCAAO TATCOCTACC ATCACCCCAT TCCCGCCCAA CTCTCCCCTA GATTTCCGGT GTTCCAOCTA GCTOGGAAGC GCACACAACr TCATITGCAQ ACOOGOACAA 
34«1 CCACACTTAT CTCTSCACAG CATAACCTGG TCCCGCTCAA CCGCAATCTT CCTCACGCCTTAGTCCCCGA OTACAAGGAG AAGCAACCCC OCCCGGICGA AAAAITCTTG AACCAOITCA 
3«01 AACACCACTC ACrACrraTO GTATCACAGG AAAAAATTCA ACCTIXCCCT AAGAGAAtCG AATOGATCGC CCCGATTGGC ATA GCCO gl O CAGATAAGAA CTACAACCTO tiLUIUJOCH 
3721 TTCCGCCCCA OCCACGOrAC GACCTOOrer TCATCAACAT TOGAACTAAA TACAGAAACC ACrACTTTCA GCACTGCGAA GACCATOCOG CGACCTTAAA AACCCTTTCG C g nilLULLL 
M4I TCAATIGCCT TAACCCAOGA OGCACCCTCC TCCTGAAGTC CTATOOCTAC GCCGACCGCA ACACIGAGGA CGTAGTCACC GCTCTTCCCA GAAAGIlTCr CAGGGTCTCC CCAOCGACAC 
mi CAGATTBIUr CTCAAGCAAT ACAGAAATCT ACCIGATTTT CCGACAACTA GACAACAOCC GTACACGGCA ATTCACCCCG CACCATCTUA ATTOCOTGAT l iLLILLGIU TATGACaGTA 
*Otl CAaGAGATCG AGTTGGAGCC GC GCCGICA T ACCGCACCAA AAGGOAGAAT ATTOCTBACr CTCAAGAOGA ACCA GIimC AACGCAGCCA ATCCGCTCCO TAGACCACGC GAAGGAGICT 
*20l OCCCroCCAT CTATAAACCT TCQCCCACCA GTTTTACCGA TICAGCCACG GAGACAGGCA CCGCAAGAAT GACIUTCTOC CTAGGAAAGA AACXGATCCA CGCCGTCOGC CCTCATTTCC 
4321 GGAAGCACCC AGAAGCAGAA GCCTPOAAAT TCCTACAAAA CGCCTACCAT GCACIGGCAG ACTTAGTAAA TGAACATAAC ATCAAGTCTC TCGCCATTCC ACroCTATCT ACAOGCA1TT 
4441 ACGCAGCCGG AAAAGACCGC CITQAAGrAT CACITAACTG CTTGACAACC CCGCTACaCA CAACTGACCC CCACCTAACC ATCTATTGCC TCCATAAGAA CTTCCAACCAA ACAATCCACC
4561 CCOCACTCCA ACTTAAGGAG TCTGTAACAG AGCTGAACGA TCAAGATATC GACATCCACG ATCACTTACT ATCGATCCAT ccacacactt CCTTCAACOG AAGAAACGGa TTCAGTACTA 
4681 CAAAACCAAA ATTCrATTCG TACTTCGAAC CCACCAAATT CCATCAAOCA CCAAAACaCA TCCCCCACAT AAACCTCCTC TTCCCTAATC ACCACCAAAG TAATCAACAA LiUiUlU-Ll *
4801 ACATATTCGG tcacaccatc GAACCAATCC GCGAAAACTC cccgctccac CATAACCCCT CCTCTAOCCC OCCCAAAACG TTCCCCTCCC TTTCCATCTA TCCCATCACC CCACAAACCC
4921 TCCACACACT TAGAAOCAAT AACGTCAAAC AACTTACACT ATOCTCCTCC ACCCCCCTTC CTAACCAGAA AA1TAACAAT GTTCACAACC TTCACTCCAC CAAACTACTC CTCTTTAATC
5041 CGCACACTCC CCCATTCCTTT CCCOCCCCTTA ACTACATACA ACTOCCACAA CACCCTACCG CTCCTCCTCC ACaCCCCGAG GACGCCCCCG AACTTCTAGC GACACCCTCA CCATCTACAC 
5161 CTCATAACAC CTCCCTTCAT GTCACAGaCA TCTCaCTCCA TATCCATCAC AGTAOCCAAO CCrCACTTTT TTCCACCTTT AGCGGATCCC ACAACTCTAT TACTACTATO CACACTTCCT 
5281 CCTCAGCACC TACTTCACTA GAGATAGTAG ACCCAACCCA CGTCGTGGTG GCTGACGTTC ATGCCGTCCA aCaCCCTCCC CCTATTCCAC CGCCAAGGCT AAACAaCaTO CCCCGCCTCG
5401 CACCCCCAAC AAAACAGCCC ACICCACCCG CAAGCAATAG CTCTCACTCC CTCCACCTCT CITTTGGrCC GGTATCCATO TCCCTCGGAT CAATTTTCCA CCGAGACACG CCCCC CC AGG
5521 CAGCGCTACA ACCCCTGGCA ACAGOCCCCA CGCATGTCCC TAIUILIIIC OGATCCTTTT CCGACGGAGA GATTCATGAG CTGAGCCGCA GAGTAACIVA OTCCGAACCC OILLIUHIU 
5641 GATCATTTQA ACCGCOCGAA GICAACTCAA TTATATCCTC CCCATCAGCC CT AlLlllli. CACTACGCAA GCACAGACCT AGACGCACCA GCAOGAGCAC ICAATACrCA CTAACCCCGG 
5761 TACCTCCCrA CATAllllLli A C COACACAO GCCCTGGGCA CTfGCAAAAG AAGTCCCTTC TOCACAACCA GCTTACAGAA CCGACCTTCG AGCGCAATGT CCTCGAAAGA ATTCATOCCC
5881 COOTCCTC6A CACCTCGAAA CAOOAACAA C TCAAACTCAG GTACCaGATO A TC C C CA CCG AAGCCAACAA AAGTAGGTAC CACTCTCCTA AAGTACAAAA TCAGAAAGCC ATAACCACrC
6001 AOCGACTACr CTCAGGACTA CQACTGrATA ACTCTCCCAC ACATCACCCA GAATOCTATA ACATCACCTA TCCGAAACCA TrGTACTCCA CTAGCCTACC CGCGAACTAC TCCGATCCAC 
6121 AGTTCGCnn' AGCIUICTGr AACAACTATC TOCATCAGAA CTATCCGACA CTAGCATCTT ATCAGATTAC TGACGACTAC GATOCTTACT TCGATATGCT AGA C CG C ACA CTCCCCTCCC 
6241 TCGATACTGC AACCTTCTCC CCCGClAAGC TTACAACTTA CCCGAAAA A A CATCAGTATA GAGCCCCCAA TATCCCCACT CCCGTTCCAT CAGCGATCCA GAACACCCTA GAAAATGTGC
6361 TCATTCCCOC AACTAAAAGA AATTGCAACG TCACGCAGAT GCGTCAACTG CCAACACTGG ACTCACCCAC ATTCAATCTC CAATCCTTTC CAAAATATGC ATGTAATCAC CACTATTGGG 
6481 ACGACTTCCC TCCCAACCCA ATTAOCATTA CCACTCACTT TUTCACCCCA TATCTACCTA GACTGAAACG CCCTAAGGCC GCCCCACTAT TTCCAAACaC CTATAATnC GTCCCATTCC
6601 aacaagpqcc tatcgataga ttccicatgc acatcaaaao acacctgaaa cttacaccag CCACCAAACA CACAGAACAA AGACCCAAAG TaCAACTCAT ACAAGCCCCA GAACCCCTCC 
6721 CCACrCCTTA CTTATCCGCG ATTCACCGCO AAlTACfGCG TACCCTTACO GCCOTCfTGC TTCCAAACAT TCACACCCTT TTTOACATCT COCCCGAGGA TTTTCATCCA ATCATACCAC
6841 AACACTTCAA GCAAGCCGAC CCCCTACTOC AGACGCATAT CCCATCATTC CACAAAACCC AACACGACCC TATCOCOITA ACCGCTCTGA TCATCTTGGA CCACCTGCOr CroOATCAAC
6961 CACTACTCCA CTTCATCCAG TOCCCCTTTC CACAAATATC ATCCACCCAT CTACCTACCC CTACTCCTTT TAAATTCCCC CCGATCATCA AATCCCGAAT CiUtLILA CA CTTTTTCTCA
7081 ACACACnrr CAATCTCCTT ATCCCCACCA CAGTACTACA ACACCCCCTT AAAACCTCCA GATUTGCAGC CrrCATTCCC CACCACAACA TCATACATCC ACTACTATCT GACAAACAAA
7201 TCCCTCACAO CTCCCCCACC TCCCTCAACA TCGACCTTAA CATCATCGAC CCACTCATCC CTGAGAGACC ACCTTACTTC TCCCOCCCAT TTATCTTCCA AGATrCCOIT ACnCCACAO

7321 ccrccccccr cccgcacccc ctcaaaagcc tutttaactt ccgtaaaccc cttccaccco accacgacca agacgaagac acaagaccco CTcrocTACA TCAAACAAAO GccnsonTA 

7441 GACTAGGTAT AACAOGCACT TTACCACTCG CCGTCACCAC CCCCTATGAO GTAGACAATA TTACACCTCT CCTACTOOCA TTCACAACTT TTCCCCACAG rAAAAQAOCA TTCCAAGCCA 

7561 tcacaoocca aataaagcat ciCTACCGrG crccTAAATA G1CAOCATAO tacatttcat ctcactaata ctacaacacc accaocatga atacacgatt ctttaacato c iTq wc p g c

7681 CCCCCTTCCC CCCCCCCACT CCCATOTCCA GCCCOCGCAO AAOOACCCAO CCGCCCCCCA TCCCTCCCCC CAACGGCCTC CCTICTCAAA TCCACCAACT GACCACACCC GTCACIOCCC
7801 TACTCATTCG ACAGCCAACT ACACCTCAAC CCCCACCTCC ACCCCCCCCA CCCCGCCAOA A C AA C C A CCC CCCCAAGCAA CCACCCAAGC COAAOAAACC AAAAACCCAG OAOAACAACA 
7921 ACAAGCAACC TOCAAAACCC AAACC C COAA ACAOACACCC CATCCCACTT AAGTrCGACG CCCACACATT CTTCCAOnC AAOAACCACO ACGGAGATGT CATCCCCCAC GCACTCCCCA
8041 TCGAACGAAA GCTAATOAAA CCTCTCCACG TCAAACGAAC CATCCACCAC CCTCTCCTAT CAAAGCTCAA ATTTACCAAO TgOICACCAT ACGACATCCA GTTCCCACAQ TTCCCACICA 
8161 ACATQAOAAO TaAGCCATTC ACCTACACCA CTCAACACCC CGAAGCATTC TATAACTCCC ACCACGGAGC CCTCCACTAT ACTGGAGGTA GATTTACCAT CCCTCCCCGA CTAGCAGCCA
8281 GACGAGACAG CG GICG T CCG ATCATCGATA ACTCCGCTCG GGTRnCGCG ATACTCCTCO CTCGAOCTGA TCAAGCAACA CCAACTCCCC llltOOlLgi CACCTCGAAT ACTAAACGGA
8401 AGACAATTAA CACGACCCCG CAACCGACAC AAGACTCCTC CGCAOCACCA CTCCTCACCO CA AIOIb l U CCICCCAAAT GTCACCTTCC CATCCGACCG CCCCCCCACA TSCTATACCC
8521 GCCAACCTTC CAGACCCCTC CACATCCTTG AAGACAACCT CAACCATCAG CCCTACGATA CCCTCCTCAA TGCCATATTO CGCTCCCCAT CCTUtCCCAG AAGCAAAAGA AGCCICACTG
8641 ACQACTTTAC CCTCACCACC CCCTACTTCC CCACATCCTC GTACTCCCAC CATACXGAAC CCTGCTTCAG CCCTCTTAAG ATCGACCACG TCTCCOACGA ACCCGACGAT AACACCATAC 
8761 CCATACACAC TTCCCCCCAG TTTGGATACG ACCAAACCCG ACCACCAACC CCAAACAACT ACCCCTACAT CTCCCTTCAC CACGATCACA CCCTTAAACA AGCCACCATQ GATCACATCA
8881 AGATTAGCAC CTCACGACCC TCTAGAACGC TTACCTACAA AGCATACTTT CTCCTCGCAA AATCCCCTCC AGGGQACACC CTAACCGTTA GCATAGTCAO TACCAACTCA CCAACCrCAT
9001 OTACACTXSGC CCCCAAGATA AAACCAAAAT TCGTCCGACC CGAAAAATAT CATUrACCTC CCCTTCACCG TAAAAAAATT CCTTCCACAG TCTACOACCG TCTCAAAOAA ACAACTGCAG 
9121 GCTACATCAC TATCCACACG CCCGCACCCC ACGCHATAC ATCCTACCTO GAAOAATCAT CACGGAAAGr TTACOCAAAG CCCCCATCTG CGAAOAACAT TAGGTATOAG TCCAAC7ICCG
9241 CCCACTACAA GACCGCAACC CTTTCCACCC CCACCGAAAT CA CIUiliUC ACCCCCATCA ACCACTSCCT CCCCTATAAG ACCGACCAAA CGAAGIG CO T CTTCAACTCA CCGGACTTCA 
9361 TCACACATCA CGACCACACG CCCCAAGGCA AATICCATTT CCLIllLAAG nOATCCCCA GTACCTCCAT CCTCCCTCTT CCCCACCCCC CGAATOTAAT ACATOOCTTT AAACACATCA 
9481 OCCTCCAATT AGATACAGAC CACTTGACAT TCCTCACCAC CAGGACACTA CCOCCAAACC CGCAACCAAC CACTCAATCC ATCCTCCGAA AGACGCTCAC AAACTXCACC OTCGACCGAO
9601 ATCCCCTGGA ATACATATCO CCAAATCATC ACCCAGTCAG GGTCTATGCC CAAGAOTCAG CACCAGGACA CCCTCACCGA TCCCCACACO AAATACTACA GCATTACTAC CATCOCCATC 
9721 CfOTOIA CAC CATCTTACCC GICCCATCAG CTACCCTCCC GATQATCATr CGCCTAACCG TTCCACTOTT ATSTGCCTCT AAA C OCCG C C CTTQAGTCCCr GACCCCATAC GCCCTCCCCC 
9841 CAAACCCCGT AATCCCAACT TCGCTOGCAC TdTCTCCTG CCTTAOCTCG CCCAATCCTG AAACGTTCAC CGAGACCATG ACTTALllUl CCTCCAACAG TCACCCCTTC TTCTOOCTCC 
9961 AUI lOIUCAT ACCTTTCCCC GCTTTCATCG TTCTAATCCO CTCCTCCTCC TCCTCCCTCC CTTTTTTACT CGTTCCCCCC GCCTACCTCC CGAACCTACA CCCCTACCAA CATOCCACCA 
10081 CTOTTCCAAA TQTOCCACAO ATACCCTATA ACCCACTTCT TQAAACGCCA GGCTATCCCC CGCICAATTT CCAGATCACT CICATCTCCT CCGAOGTTTT GCCTTCCACC AACCAACACT
10201 ACATTACCTC CAAATTCACC ACTCTCCTCC CCTCCCCAAA AATCAAATCC TCCGCCTCCT TCGAATCTCA GCC COC CG C T CATCCACACT ATACCTCCAA GGTCTTCGGA CG CG TCT A CC 
10321 CCTTTATCTC GCGAGGAGCG CAATCTTTTT GCCACAC7GA GAACACCCAC ATCACTCACC CCTACCTCGA ACTUTCAGCA GATTCCCCGT CTCACCACGC CCACCCGATT AACCTCCACA
10441 CTCCCCCCAT CAAAGTAGCA CTUCCTATAC TCTACGCCAA CACTACCACT TTCCTAGATO TCTACCTCAA CCGACTCACA CCACCAACCT CTAAAGaCTT CAAACTCATA CCTOCA C CAA 
10561 TTTCAGCATC GnTACCCCA TTCCATCATA ACCTCCTTAT CCATCGCGGC LIUOlblA CA ACTATGACTT CCCGGAATAT GCAGCGATGA AACCACCACC GTITCGAGAC ATTCAAOCTA
10681 CCTCCTTCAC TAGCAAGGAT CTCATCCCCA GCACAGACAT TACCCTACTC AAGCCTTCCC CCAAGAACCT GCATCTCCCC TACACOCAGO CCGCATCAGG ATTTGAGATG TCGAAAAACA
10801 ACTCACGCCG CCCACTCCAG GAAACCGCAC CTTTCGCCrO TAAGATTCCA GTAAATCCGC TCCGAGCGGT GGACTGTTCA TACGGGAACA TTCCCATTTC TATTCACATC CCCAACCCTQ
10921 CCTTTATCAG CACAICACAT GCACCACTCG TCTCAACAGT CAAATCTGAA CICAGTCACr CCACTTATTC AGCACACTTC CGCGGGATGC CCACCCTGCA CTATCTATCC GACCCCCAAC 
11041 GTCAATBCCC CCTACAITCG CATTCCACCA CACCAACTCT CCAAGACTCG ACAGTACATG TCCTOGAOAA AGGAGCCCTC ACAGTACACT TTACCACCGC CAGTCCACAO CCGAAdTTA
11161 TCGTATCGCT CTGtCGGAAG AAGACAACAT CCAATGCAGA ATCTAAACCA CCAGCTCACC ATATCCTCAG CACCCCGCAC AAAAATGACC AAGAATTTCA ACCCGCCATC TC AAAAACAT 
11281 CATCGACTTG OLIblllL CC CmTCGGCC GCGCCTCCTC GCTATTAATT ATAGGACTTA TCATTTrTCC TTGCAGCATG ATOCTCACTA CCACACGAAG ATGACCGCTA CCCCCCAATG
11401 ATCCGACCAG CAAAACTCGA TCTACTTCCG AGGAACTGAT GTCCATAATG CATCAGGCFC CTACATTACA TCCCCGCTTA CCCCGCGCAA TATACCAACA CTAAAAACTC CATCrACITC
11521 CGACGAACCC CACTTCCATAA TCCTGCGCAG TGTTCCCACA TAACCACTAT ATTAACCATT TATCTACCCG ACGCCAAAAA CTCAATCTAT TTCTCACGAA GCCTCCTCCA TAATCCCACG 
11641 CACCCTCTGC ATAACmTA TTATTTCTTT TATTAATCAA CAAAATTTTG Mil l A ACAT TTC 